PREFACE

THE  aim  of  this  book  is  to  give  an  account  of  the  method  of 
Least  Squares,  without  entering  into  elaborate  descriptions 
of  instruments  or  of  experimental  methods.  The  reader  who 
wishes  to  apply  the  method  to  his  own  subject  can  find  the 
experimental  details  in  the  textbooks  relating  to  his  own  special 
subject,  and  any  attempt  to  include  such  details  in  the  present 
volume  would  have  meant  a  considerable  addition  to  its  size. 
Hence  it  has  been  the  author's  aim  to  show  how  to  obtain  the  best 
interpretation  of  the  results  of  experiment,  without  consideration 
of  the  way  in  which  these  results  are  to  be  obtained. 

It  cannot  be  too  strongly  insisted  upon  that  the  methods  of 
Least  Squares  cannot  in  any  way  improve  upon  the  actual 
observations.  The  application  of  these  methods  to  a  large  number 
of  carelessly  conducted  experiments  cannot  in  general  be  expected 
to  yield  results  as  reliable  as  could  be  obtained  from  two  or  three 
carefully  conducted  experiments. 

The  proof  of  the  Normal  Error  Law  has  been  based  on  Hagen's 
hypotheses  regarding  errors  of  observation.  In  most  of  the 
problems  of  Astronomy,  Geodetics,  and  Physics  the  errors  of 
observation  satisfy  the  hypotheses,  and  the  application  of  least 
square methods is justified. But cases may arise in which particular
care  is  necessary  in  applying  these  methods.  This  is  especially 
true  of  Biological  problems.  For  organic  variability  is  the  resultant 
of  a  large  number  of  contributory  causes,  some  of  which  may  have 
a  definite  tendency  to  act  always  in  one  direction.  The  effect  of 
such  a  bias  is  to  produce  an  unsymmetrical  frequency  distribution, 
and  the  application  of  ordinary  least  square  methods  is  then 
meaningless.  It  is  thus  in  no  way  justifiable  to  regard  Least 
Squares  as  a  magical  instrument  applicable  to  all  problems. 

Of  the  arrangement  of  this  volume  little  need  be  said.  The 
discussion of methods applicable to problems involving only one
unknown quantity will be found in the first four Chapters. Many
of  the  problems  of  the  physicist  involve  one  unknown  only,  and  the 
first  four  Chapters  contain  all  the  theory  that  has  to  be  considered 
in  the  discussion  of  such  problems.  The  subject  of  Chapter  VII, 
the  adjustment  of  conditioned  observations,  has  only  been  outlined 
very  briefly.  The  fuller  development  of  the  subject,  which  forms 
the  basis  of  the  adjustment  of  triangulations,  will  be  found  in  the 
works  to  which  reference  is  made  at  the  end  of  Chapter  VII.  The 
last  four  Chapters  can  only  be  regarded  as  mere  introductions  to 
the  subjects  discussed,  but  it  was  thought  that  their  inclusion  in 
a  textbook  on  Least  Squares  would  be  an  advantage. 

I  have  to  acknowledge  my  indebtedness  to  Mr  F.  J.  M.  Stratton, 
of  Gonville  and  Caius  College,  Cambridge,  to  whose  University 
lectures  I  owe  most  of  my  knowledge  of  the  subjects  discussed  in 
this  book,  and  upon  whose  notes  I  have  drawn  freely.  I  have 
also  to  thank  Professor  Eddington  for  many  useful  suggestions 
made  while  the  book  was  in  manuscript  form,  and  for  permission 
to  extract  from  his  lecture  notes  a  number  of  interesting  examples 
as  well  as  some  portions  of  the  theoretical  treatment. 

I  cannot  express  adequately  my  debt  to  the  Cambridge 
University  Press,  for  the  extreme  care  shown  in  passing  the  book 
through  the  proof  stage.  Owing  to  my  being  abroad  at  the  time, 
I  was  not  able  to  devote  as  much  time  as  was  desirable  to  the 
reading  of  proofs,  and  but  for  the  unfailing  vigilance  of  the  Press, 
many  errors  would  have  been  allowed  to  pass  into  the  text. 

DAVID   BRUNT. 


Meteorological  Section,  R.E. 
G.  H.  Q. 
November 23, 1916.

CHAPTER I

ERRORS    OF    OBSERVATION 

1.  The  first  serious  step  in  the  advance  of  the  sciences  which 
are  dependent  upon  measurements  of  any  kind  came  with  the 
introduction of instruments. Even the most primitive instruments
yielded results far in advance of those obtained by mere
estimation  with  the  unaided  human  senses.  With  the  course  of 
time  the  development  of  new  instruments  has  always  been  in  the 
direction  of  greater  refinement  and  accuracy.  But  even  the  most 
refined  instruments  often  fail  to  yield  absolutely  accurate  results ; 
for  as  a  rule  it  is  found  that  when  a  series  of  measurements  of  the 
same quantity is taken, the results do not show perfect agreement.
The disagreement between individual observations in a
series  is  attributed  to  errors  of  observation.  Any  observation 
which  is  of  the  nature  of  a  measurement  is  affected  by  three 
factors,  the  instrument  used,  the  external  conditions  at  the  time 
of  observation,  and  the  observer.  Each  of  these  factors  may 
introduce  errors  into  the  observation.  We  shall  consider  these 
three  factors  in  turn. 

2. An instrument may be defined as a mechanical means of
extending the ordinary human faculties, so as to yield measurements
of greater refinement than are possible without its aid.
The  amount  of  water  in  a  cup,  the  weight  of  a  stone,  the  angular 
distance  between  two  distant  points,  can  be  roughly  determined 
by  estimation,  but  measurements  of  far  greater  accuracy  can  be 
obtained  by  pouring  the  water  into  a  graduated  vessel,  weighing 
the  stone  on  a  balance,  and  measuring  the  angular  distance 
between  the  distant  points  with  a  theodolite.  The  graduated 
measure, the balance, and the theodolite are examples of instruments.


Just  as  the  graduated  vessel  is  liable  to  errors  of  graduation, 
so the most carefully made instrument is liable to errors of construction,
and these errors affect the observations made with the
instrument,  producing  what  are  known  as  "  instrumental  errors." 
The  errors  of  construction  of  an  instrument  may  be  of  the  nature 
of  errors  of  graduation  of  a  scale,  periodic  errors  in  a  screw,  or 
maladjustment  of  the  separate  parts  of  the  instrument.  In  order 
to  obtain  the  best  possible  results,  it  is  necessary  to  make  a 
careful  search  for  errors  in  construction,  or  in  actual  working,  of 
the  instrument,  so  that  their  effect  may  be  eliminated.  The 
errors  may  be  of  such  a  nature  that  it  is  possible  to  calculate 
their  effect,  and  make  an  empirical  correction.  But,  when  possible, 
it  is  better  to  arrange  the  system  of  observations  in  such  a  way 
that  the  instrumental  errors  eliminate  themselves.  For  example, 
if  measurements  have  to  be  made  by  means  of  a  scale  graduated 
around  a  circle,  it  is  possible  to  eliminate  most  of  the  errors  due 
to  eccentricity  of  the  circle,  as  well  as  the  greater  part  of  the 
effect  of  periodic  errors  in  the  graduation  of  the  scale,  by  placing 
an  even  number  of  verniers,  say  four  or  six,  at  equal  distances 
around  the  circle*.  When  it  is  not  possible  to  eliminate  the 
instrumental  errors  by  the  adoption  of  a  suitable  method  of 
observation,  it  may  be  necessary  to  carry  out  a  special  series  of 
observations  for  the  purpose  of  calculating  empirical  corrections 
for  the  measurements  yielded  by  the  instrument. 

The  use  of  refined  instruments  does  not  always  diminish  the 
difficulties of observation. "The more refined the methods employed,
the more vague and elusive does the supposed magnitude
become; the judgment flickers and wavers, until at last in a sort
of despair some result is put down, not in the belief that it is
exact, but with the feeling that it is the best we can make of the
matter†."

3. The external conditions are such parts of the environment
of the observer as affect either the observer or his instrument,
and are beyond the observer's control; e.g. temperature,
wind, or sunlight. If the external conditions be subject to violent
change, it may be necessary to suspend work for a time. In some
cases it is possible to make corrections for the errors introduced
by changing external conditions by means of an empirical law
determined by an independent set of observations. Such corrections
are never perfectly satisfactory, and the observer should
avoid the necessity as far as possible, by observing only at times
when the external conditions are steady.

* See Ball, Spherical Astronomy, p. 462.

† Lamb, Presidential Address to Brit. Assoc. 1904.

4.  The  observer  may  have  personal  peculiarities  which  will 
affect  all  his  observations.  He  may  always  measure  an  angle  as 
larger  (or  smaller)  than  it  really  is,  or  he  may  always  tend  to  note 
the  passage  of  a  star  across  the  wires  in  the  focal  plane  of  an 
instrument  a  slight  interval  before  (or  after)  the  true  time  of 
transit.  A  careful  and  experienced  observer  appears  to  commit 
an  error  which  is  generally  of  the  same  sign,  and  approximately 
of  the  same  magnitude,  in  a  series  of  similar  observations.  Such 
an  error  is  called  the  "  personal  equation "  of  the  observer.  It 
can  be  corrected  by  comparison  with  a  fixed  arbitrary  standard. 
It  should  be  noted,  however,  that  only  experienced  observers  have 
a  well-defined  personal  equation.  An  inexperienced  observer  will 
commit  errors  of  varying  magnitude  and  sign,  and  even  an 
experienced  observer  ceases  to  have  a  personal  equation  when  not 
in  his  normal  state.  In  order  to  obtain  the  best  possible  results, 
an  observer  should  not  continue  to  work  when  he  is  tired. 

From  these  three  sources  there  will  arise  errors  of  a  more  or 
less  systematic  nature,  varying  according  to  definite  laws  with  the 
changing  conditions  of  the  observations.  When  the  observed 
value  has  been  corrected  for  these  sources  of  error,  we  might 
expect  the  corrected  observation  to  yield  the  true  value  of  the 
quantity  to  be  measured.  But  if  a  series  of  observations  be 
made,  and  corrected  in  each  case  for  the  errors  due  to  the  three 
factors  considered  above,  it  will  in  general  be  found  that  the 
corrected measurements differ among themselves. These individual
differences are ascribed to a fourth class of error, known as
the accidental error.

5. Accidental errors are due to no known cause of systematic
or constant error. They are irregular, and more or less
unavoidable. The term "accidental" is not used here in its
ordinary significance of "chance." Strictly speaking, an observation
of any kind is affected by the state of the whole universe at
the time of observation. But as an observer cannot take account
of the whole universe and its changes of condition during the
time occupied by his observations, he has to limit his attention to
those operative causes which he regards as affecting his observations
in a measurable degree; i.e. he limits his attention to the
"essential  conditions."  If  an  observation  could  be  repeated  a 
number  of  times,  and  corrected  in  each  case  for  changes  in  the 
essential  conditions,  the  results  of  all  the  observations  should  be 
identical. But in practice it is found that the individual observations
in a series differ among themselves. These differences may
be ascribed to the fact that the so-called "essential conditions"
do not include all the effective operative causes. There will be
other  operative  causes  of  error,  whose  laws  of  action  are  unknown, 
or  too  complex  to  be  investigated.  These  causes  will  introduce 
errors  which  will  appear  to  the  observer  to  be  accidental. 

We  can  now  define  accidental  errors  as  errors  whose  causes 
and  laws  of  action  are  unknown.  The  total  accidental  error  in 
any  individual  measurement  may  be  the  sum  of  a  number  of  small 
accidental  errors  arising  from  different  causes.  Among  such 
errors  would  be  accounted  those  arising  from  slight  irregular 
changes  in  the  external  conditions,  such  as  the  vibration  of  the 
image  of  a  distant  object  on  account  of  air-currents,  and  the 
uncertainty  of  placing  a  cross-wire  upon  the  image  of  a  scale 
division;  and  also  irregular  changes  in  the  personal  equation  of 
the observer. There will also be included in this class the remnants
of instrumental errors, but if it should be possible to
discover  the  law  of  action  of  any  such  error,  it  is  thereby  removed 
from  the  class  of  accidental  errors  to  the  class  of  systematic 
errors.  Thus,  when  a  distance  is  measured  a  large  number  of 
times  with  different  parts  of  a  scale,  the  errors  of  the  scale- 
division  enter  into  the  results  in  a  more  or  less  accidental  manner. 
But  if  the  scale  errors  be  carefully  investigated,  their  effect  can 
be  eliminated  from  the  observed  values.  Carelessness  in  the 
handling  of  an  instrument  may  introduce  irregular  instrumental 
errors  which  fall  into  the  class  of  accidental  errors. 

It  is  thus  seen  that  even  when  all  the  systematic  errors 
traceable to the instrument, the external conditions, or the
observer, have been corrected, no observation can be regarded as
perfect.  It  will  miss  perfection  on  account  of  the  presence  of 
accidental  errors.  The  effect  of  the  accidental  errors  will  differ 
for  different  observations  in  the  same  series.  It  is  thus  impossible 
to  attain  certainty  in  the  result  of  an  observation.  In  practice  a 
series  of  observations  is  made,  in  the  hope  that  the  discussion  of 
the  series  will  eliminate  the  effect  of  the  accidental  errors.  The 
problem  which  we  have  to  solve  is  that  of  deciding  the  best 
method  of  conducting  this  discussion.  Our  problem  may  be 
briefly  stated  as  follows.  Given  a  series  of  observations,  each  of 
which  has  been  made  with  all  possible  care,  and  corrected  for  all 
known  causes  of  error,  how  shall  we  determine  the  most  probable 
values of the quantities to be determined? The material with
which the theory deals is supposed to have been cleared of all
constant and systematic errors, and to be subject only to accidental
errors, whose laws of action are unknown. The values of the observations,
thus corrected, so as to be subject only to accidental
errors, will in future be referred to as the "observed values."

In  what  follows,  accidental  errors  will  be  regarded  as  obeying 
the  following  laws : 

(i)  A  large  number  of  very  small  accidental  errors  are  present 
in  any  observation. 

(ii)  A  positive  error  and  an  equal  negative  error  are  equally 
probable. 

(iii)  The  total  error  cannot  exceed  a  certain  reasonably  small 
amount. 

(iv)  The  probability  of  a  small  error  is  greater  than  the 
probability  of  a  large  error. 

As  we  shall  frequently  have  to  refer  to  constant  and  systematic 
errors  in  the  sequel,  it  will  be  well  to  have  a  clear  conception  of 
the  meaning  of  these  terms.  A  constant  error  is  one  which 
has  the  same  effect  upon  all  the  observations  in  a  series.  It  has 
the same magnitude and sign in all the observations. A systematic
error is one whose sign and magnitude bear a fixed relation
to one or more of the conditions of observation. It should be
noted that neither of these types of error fulfils the laws of accidental
errors given above.

6. Frequency Curves.

Before  proceeding  further  with  the  theoretical  discussion  of 
errors,  we  shall  consider  briefly  the  general  nature  of  the  material 
with  which  the  theory  has  to  deal.  The  table  given  below  shows 
the  results  of  a  series  of  observations  extracted  from  the  account 
of  the  preliminary  experiments  on  photographic  transits  at  the 
Observatory  of  Tokio.  The  first  and  third  columns  give  the 
actual  observations,  while  the  second  and  fourth  columns  give  the 
deviations  of  the  individual  observations  from  the  arithmetic 
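mean (4.986) in units of the third decimal place*.

Observation   Deviation     Observation   Deviation
   4.974        -12            4.978         -8
   4.982         -4            4.993         +7
   4.978         -8            4.988         +2
   4.989         +3            4.983         -3
   4.993         +7            5.001        +15
   4.979         -7            5.015        +29
   4.984         -2            4.993         +7
   4.987         +1            4.991         +5
   5.001        +15            4.974        -12
   4.997         +9            4.971        -15
   4.986          0            4.967        -19
   4.978         -8            4.991         +5
   4.983         -3            4.988         +2
   4.983         -3            4.984         -2
   4.990         +4            4.972        -14
   4.991         +5            4.972        -14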

The deviations from the arithmetic mean vary from -19 to +29.

These deviations (or the actual observations) may be represented
graphically as follows (fig. 1). Let the deviations be
measured along the horizontal axis, and the number of observations
along the vertical axis. Divide the total range of deviation into
a number of intervals, say 0 to ±5, ±5 to ±10, etc. For each
observation  put  a  dot  along  the  ordinate  through  the  middle 
point  of  the  interval  within  which  it  falls,  successive  dots  on  the 
same  ordinate  being  placed  at  unit  distance  apart.  The  height  of 
the last dot on any ordinate gives the number of observations
which fall within the corresponding deviation interval. If the
tops of all the ordinates be joined, the resulting broken line
represents the frequency of the different measurements, and is
called the frequency curve (see figure 1).

* These deviations are called the residuals.

Curves obtained by this method, for a moderate number of
observations, will generally show one or more peaks, some horizontal
portions, and may at some points lie along the horizontal
axis.  Similar  curves  will  be  found  in  figures  5,  6,  7,  and  8.  An 
alternative  method  of  completing  the  diagram,  instead  of  joining 
the  tops  of  successive  ordinates,  is  to  draw  a  rectangle  whose 
height  is  equal  to  an  ordinate,  and  whose  breadth  extends  over 
the  class-interval,  as  shown  in  figure  2.  Such  a  diagram  is  called 
a  histogram,  or  a  frequency-polygon. 


[Fig. 2. Histogram of material shown in figure 1.]
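
The construction of the frequency diagram described above may be sketched in a few lines of Python. The deviations are those of the Tokio table; the class-interval of 5 units follows the text, but the treatment of a deviation falling exactly on a boundary (here assigned to the interval beginning at that value) is an assumption, since the text does not settle it. The names `deviations`, `width` and `counts` are merely illustrative.

```python
from collections import Counter

# Deviations from the mean 4.986, in units of the third decimal place,
# copied from the table of Tokio transit observations.
deviations = [
    -12, -8, -4, 7, -8, 2, 3, -3, 7, 15, -7, 29, -2, 7, 1, 5,
    15, -12, 9, -15, 0, -19, -8, 5, -3, 2, -3, -2, 4, -14, 5, -14,
]

width = 5  # class-interval, as suggested in the text

# Assumption: each interval is half-open, lo <= d < lo + width.
counts = Counter((d // width) * width for d in deviations)

# One dot per observation, as in the dot diagram of the text (turned on its side).
for lo in range(min(counts), max(counts) + width, width):
    print(f"{lo:>4} to {lo + width:>3}: " + "." * counts[lo])
```

Each row of dots corresponds to one ordinate of the diagram; the number of dots is the height of the rectangle in the histogram.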


If  we  compare  the  frequency  curves  obtained  by  plotting  the 
repetitions of an observation, using at first a small number of
repetitions, and then larger and larger numbers, we find that for a
small  number  of  observations  the  curve  shows  peaks  and  valleys 
and  horizontal  portions,  and  changes  its  form  considerably  when 
the  number  of  observations  is  changed.  But  as  the  number  of 
observations  becomes  larger,  the  frequency  curve  tends  to  approach 
a  fixed  form,  having  in  general  a  single  maximum  ordinate,  while 
the  curve  slopes  down  towards  the  axis  on  each  side  of  the 
maximum.  It  then  becomes  convenient  to  draw  a  smooth  curve 
through  the  tops  of  the  ordinates.  As  the  number  of  observations 
is  still  further  increased,  the  successive  frequency  curves  become 
more  and  more  similar  in  form,  until,  when  the  number  of 
observations  is  very  great,  there  is  no  appreciable  difference  in 
the  form  of  successive  frequency  curves.  The  final  form  of  the 
curve  is  that  which  represents  the  frequency  distribution  of  an 
indefinitely  large  number  of  observations.  This  final  curve  is 
called  the  "curve  of  presumptive  errors." 

In  practice,  it  is  never  possible  to  repeat  an  observation  an 
indefinitely  large  number  of  times,  and  we  have  to  content 
ourselves  with  regarding  the  frequency  curve  obtained  from  a 
finite  number  of  observations  as  yielding  a  reasonably  good 
approximation  to  the  curve  of  presumptive  errors. 

The  study  of  a  large  number  of  curves  of  presumptive  errors 
shows  a  decided  similarity  in  their  form,  and  a  strong  tendency  to 
approach  a  typical  form  distinguished  by  symmetry  about  the 
maximum  ordinate.  The  approach  to  the  typical  form  is  so 
striking  that  it  is  a  matter  of  extreme  importance  to  investigate 
the  possible  analytical  form  of  the  curve.  In  the  next  chapter, 
starting  from  certain  hypotheses,  we  shall  deduce  an  analytical 
formula  to  represent  the  typical  curve.  The  purpose  of  the 
formula  is  to  express  the  proportion  of  the  total  number  of 
observations  whose  errors  shall  lie  between  any  assigned  limits, 
say $\Delta$ and $\Delta + d\Delta$; in other words, it will express the probability
that the error of a single observation shall lie between $\Delta$ and
$\Delta + d\Delta$. The use of the words "frequency" and "probability" to
denote  the  same  thing  is  common  to  writers  on  this  subject.  A 
moment's  consideration  of  the  definition  of  probability  will  indicate 
a justification of this custom.

"If, on taking a large number N out of a series of cases in
which an event A is in question, A happens on pN occasions, the
probability of the event A is said to be p."

The  quantity  p,  so  defined,  is  also  the  relative  frequency  of  the 
event  A. 

The  law  which  is  commonly  held  to  represent  the  typical  curve 
of  errors  is  Gauss's  Error  Law,  or  the  Law  of  Least  Squares, 
according to which the probability that an observation should
have an error between $\Delta$ and $\Delta + d\Delta$ is
$\frac{h}{\sqrt{\pi}} e^{-h^2 \Delta^2}\, d\Delta$, where $h$ is a
constant depending on the closeness of the agreement between
the  observations  in  the  series.  This  expression  will  be  derived  in 
the  next  chapter,  and  its  validity  will  be  tested  by  its  application 
to  the  adjustment  of  a  number  of  series  of  observations.  Meanwhile 
it  may  be  noted  that  a  number  of  useful  results  can  be  derived 
from  the  assumption  that  positive  and  negative  errors  are  equally 
probable.  Thus  the  accuracy  of  the  arithmetic  mean  can  be 
investigated  without  reference  to  the  actual  form  of  the  Law  of 
Errors. 

7.     The  Accuracy  of  the  Arithmetic  Mean. 
If $m_1, m_2, \ldots, m_n$ be $n$ determinations of a single unknown
quantity, the arithmetic mean is given by

$$a = \frac{1}{n}(m_1 + m_2 + \cdots + m_n).$$

If $y_1, y_2, \ldots, y_n$ be the errors of the individual determinations,
and $x$ the error of the arithmetic mean, then

$$x = \frac{1}{n}(y_1 + y_2 + \cdots + y_n).$$

Squaring each side of this equation, we find

$$x^2 = \frac{1}{n^2}(y_1^2 + y_2^2 + \cdots + y_n^2) + \frac{2}{n^2}(y_1 y_2 + y_1 y_3 + y_2 y_3 + \text{etc.}).$$

The mean value of $x^2$, say $\overline{x^2}$, is equal to the mean value of the
right-hand side of this equation. Since positive and negative
errors are equally likely to occur, and all the $y$'s are independent,
then in a large number of trials the mean value of $y_r y_s$ will be
zero. And since all the $n$ observations considered are supposed to
be carried out in precisely the same manner, we may expect the
mean values of $y_1^2, y_2^2, \ldots, y_n^2$ to be equal. If the mean value of
each of the squared terms $y_1^2$, $y_2^2$, etc., be $\mu^2$, then

$$\overline{y_1^2 + y_2^2 + \cdots + y_n^2} = n\mu^2$$

and

$$\overline{x^2} = \frac{1}{n^2}(n\mu^2) = \frac{\mu^2}{n}.$$

Hence it follows that the accuracy of the arithmetic mean is $\sqrt{n}$
times the accuracy of a single observation; a result of fundamental
importance in the theory of errors.
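
The result may be checked exactly in a simple case. The sketch below assumes, purely for illustration, that each error is +1 or -1 with equal probability (so that the mean square of a single error is 1), enumerates every combination of signs for $n$ observations, and forms the mean of the square of the error of the arithmetic mean. The function name `mean_square_error_of_mean` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def mean_square_error_of_mean(n):
    """Exact mean of x^2, where x = (y_1 + ... + y_n) / n and each y_i is
    +1 or -1 with equal probability (an illustrative assumption)."""
    total = Fraction(0)
    for signs in product((-1, 1), repeat=n):  # all 2^n equally likely cases
        x = Fraction(sum(signs), n)           # error of the arithmetic mean
        total += x * x
    return total / 2**n

for n in (1, 2, 4, 8):
    print(n, mean_square_error_of_mean(n))
```

In each case the mean square error of the mean is the mean square of a single error divided by $n$, the cross products having cancelled in the enumeration exactly as in the argument above.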


CHAPTER   II 


THE    LAW    OF    ERROR 


8. Hagen's Proof of Gauss's Error Law*.

Hagen  based  his  proof  on  the  assumption  that  an  accidental 
error  consists  of  the  algebraic  sum  of  a  very  large  number  of 
infinitesimal  errors,  all  of  equal  magnitude,  and  as  likely  to  be 
positive  as  negative.  It  has  already  been  suggested  that  the 
accidental  error  occurring  in  any  one  observation  may  be  composed 
of  a  number  of  errors  due  to  slight  changes  in  the  external 
conditions,  remnants  of  instrumental  errors,  or  irregularities  in 
the individual peculiarities of the observer. Each of these components
may in turn be regarded as due to a large number of
elementary  causes.  And  so  Hagen's  hypothesis  is  in  no  way  a 
violation  of  our  knowledge  of  the  nature  of  accidental  errors. 

Let the total number of elementary errors be $n$, where $n$ is a
number to which we can assign no limit. If the magnitude of
each of the elemental errors be $\epsilon$, then it will be possible for an
error as great as $n\epsilon$ to occur, in the extreme case where all the
small errors occur with the same sign. An error $(n - 2m)\epsilon$ will
occur when $n - m$ of the elemental errors occur with a positive
sign, and the remaining $m$ with a negative sign. The number of
ways in which this can happen is simply the number of ways in
which we can select $m$ out of the $n$ errors. The selected $m$ errors
will have one sign, and the remaining $n - m$ will have the opposite
sign. The selection can be made in

$$\frac{n!}{(n-m)!\, m!} \text{ ways.}$$

* Hagen, Grundzüge der Wahrscheinlichkeitsrechnung (Berlin, 1837).

Let $x - \epsilon$ be the error due to $n - m$ errors $+\epsilon$, and $m$ errors $-\epsilon$.
Then

$$x - \epsilon = (n - 2m)\epsilon$$

or

$$x = (n - 2m + 1)\epsilon.$$

If $f(x - \epsilon)$ be the frequency of the error $x - \epsilon$,

$$f(x - \epsilon) = \frac{n!}{(n-m)!\, m!}.$$

An error $x + \epsilon$, or $(n - 2m + 2)\epsilon$, will be formed by $n - m + 1$
elemental errors $+\epsilon$, and $m - 1$ errors $-\epsilon$. Hence

$$f(x + \epsilon) = \frac{n!}{(n-m+1)!\, (m-1)!}.$$

It follows that

$$\frac{f(x + \epsilon)}{f(x - \epsilon)} = \frac{m}{n - m + 1}.$$

Since the elemental errors $\epsilon$ are supposed to be infinitesimally
small in comparison with the finite composite errors of actual
observations, the latter may be assumed to vary continuously from
$-n\epsilon$ to $+n\epsilon$, the extreme error $\pm n\epsilon$, whose relative frequency is
very small, being assumed to be infinite. If we define $\phi(x)\,dx$ as
the proportion of errors between $x - \tfrac{1}{2}dx$ and $x + \tfrac{1}{2}dx$, $\phi(x)$ may
be regarded as a continuous function. But the function $f(x)$ gives
the frequency of the errors $x$, where all the possible values of $x$ are
separated by intervals of $2\epsilon$. We may bring the functions $\phi(x)$
and $f(x)$ into line by regarding $f(x)$ as the frequency of all errors
between $x - \epsilon$ and $x + \epsilon$. We then have the equation

$$f(x) = C\,\phi(x) \cdot 2\epsilon.$$

It follows that

$$\frac{\phi(x + \epsilon)}{\phi(x - \epsilon)} = \frac{f(x + \epsilon)}{f(x - \epsilon)} = \frac{m}{n - m + 1},$$

$$\frac{\phi(x + \epsilon) - \phi(x - \epsilon)}{\phi(x + \epsilon) + \phi(x - \epsilon)} = -\frac{n - 2m + 1}{n + 1}.$$

Neglecting squares of $\epsilon$, we may write

$$\phi(x + \epsilon) = \phi(x) + \epsilon \frac{d\phi}{dx},$$

$$\phi(x - \epsilon) = \phi(x) - \epsilon \frac{d\phi}{dx}.$$

Substituting these values in the above equation, we find

$$\frac{\epsilon}{\phi} \frac{d\phi}{dx} = -\frac{n - 2m + 1}{n + 1} = -\frac{x}{(n + 1)\epsilon},$$

or

$$\frac{1}{\phi} \frac{d\phi}{dx} = -\frac{x}{(n + 1)\epsilon^2}.$$

Integrating this equation, we find

$$\log \phi(x) = -\frac{x^2}{2(n + 1)\epsilon^2} + \text{const.} = -h^2 x^2 + \text{const.},$$

where

$$h^2 = \frac{1}{2(n + 1)\epsilon^2}.$$

The constant $h^2$ depends upon the nature of the errors entering
into the observations. Writing the integrated equation in the form
$\phi(x) = A e^{-h^2 x^2}$, the value of the constant $A$ is easily
derived. $\phi(x)\,dx$ gives the proportion of errors between the
limits $x - \tfrac{1}{2}dx$ and $x + \tfrac{1}{2}dx$, or, practically, between the limits
$x$ and $x + dx$. Since all the possible errors must lie between
$-\infty$ and $+\infty$, it follows that

$$\int_{-\infty}^{+\infty} \phi(x)\,dx = 1 = A \int_{-\infty}^{+\infty} e^{-h^2 x^2}\,dx = 2A \int_{0}^{\infty} e^{-h^2 x^2}\,dx = \frac{A\sqrt{\pi}}{h},$$

so that $A = h/\sqrt{\pi}$. Finally, we may write

$$\phi(x)\,dx = \frac{h}{\sqrt{\pi}} e^{-h^2 x^2}\,dx.$$

This is the functional form of Gauss's Error Law, or, as it
is sometimes called, the Normal Error Law. Its interpretation is
that the relative number of observations in a series, whose errors
lie within the limits $x$ and $x + dx$, is

$$\frac{h}{\sqrt{\pi}} e^{-h^2 x^2}\,dx.$$

In a long series of observations, $n$ in number, the number of
observations whose errors lie between $x$ and $x + dx$ is

$$\frac{nh}{\sqrt{\pi}} e^{-h^2 x^2}\,dx.$$
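
It may be of interest to see numerically how closely Hagen's counting of combinations is reproduced by the error law. The sketch below is offered under stated assumptions: it takes $n = 400$ elemental errors of magnitude $\epsilon = 1$, places the composite errors at $(n - 2m)\epsilon$ (a slight simplification of the indexing used in the proof), divides the number of combinations by $2^n$ to obtain a relative frequency, and compares it with $\phi(x) \cdot 2\epsilon$, using $h^2 = 1/\{2(n+1)\epsilon^2\}$. The function names are hypothetical.

```python
import math

def hagen_frequency(n, m):
    """Relative frequency of the composite error (n - 2m) * eps:
    the number of ways of choosing m negative errors, divided by 2^n."""
    return math.comb(n, m) / 2**n

def gauss_frequency(n, m, eps=1.0):
    """phi(x) * 2 * eps, with phi(x) = h / sqrt(pi) * exp(-h^2 x^2)
    and h^2 = 1 / (2 (n + 1) eps^2), as in Hagen's proof."""
    x = (n - 2 * m) * eps
    h = 1.0 / math.sqrt(2 * (n + 1) * eps**2)
    return h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2) * 2 * eps

n = 400  # illustrative choice; the proof supposes n indefinitely large
worst = max(abs(hagen_frequency(n, m) - gauss_frequency(n, m))
            for m in range(n + 1))
print("largest discrepancy in relative frequency:", worst)
```

The largest discrepancy is small compared with the central frequency, which is about 0.04 for this value of $n$.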

9.     Thomson  and  Tait's  Proof  of  the  Error  Law. 

Consider  the  distribution  of  shots  fired  at  a  target.  Let  axes 
of  coordinates  be  drawn  through  the  centre  of  the  target,  the 
x-axis horizontal, the y-axis vertical. Let P, (x, y), be the mark
of one shot on the target. Then x and y are components of the
error of placing the point P. Since the shot is as likely to go to
the right as to the left of the centre of the target, the probability
of an error between x and x + dx in the x-coordinate is of the form
$\phi(x^2)\,dx$. Similarly the probability of an error between y and
y + dy in the other coordinate is of the form $\phi'(y^2)\,dy$. We
shall assume that the functions $\phi$ and $\phi'$ are of the same form.
This  is  equivalent  to  assuming  that  a  large  number  of  shots  fired 
at  the  target  would  be  distributed  indiscriminately  about  the 
centre,  showing  no  special  symmetry  of  distribution  about  the 
horizontal  and  vertical  axes,  as  opposed  to  any  other  axis  through 
the  centre  of  the  target. 

Then the probability that the point P should have coordinates
lying between x and x + dx, and between y and y + dy, respectively,
will be

$$\phi(x^2)\,\phi(y^2)\,dx\,dy.$$

In other words, the probability that the point P should lie within
a given small region of area dA is

$$\phi(x^2)\,\phi(y^2)\,dA,$$

x and y being the coordinates of any point within the area dA.

But if another pair of axes Ox', Oy' be drawn through the
centre of the target, so that Ox' passes through the area dA, the
probability that a single shot should be placed within the area dA
can also be expressed by

$$\phi(x'^2)\,\phi(y'^2)\,dA,$$

or

$$\phi(x^2 + y^2)\,\phi(0)\,dA.$$
Equating the two expressions for the same probability, we have

$$\phi(x^2)\,\phi(y^2) = \phi(x^2 + y^2)\,\phi(0).$$

This is a functional equation whose solution is

$$\phi(x^2) = A\,e^{kx^2}.$$

Since small errors are more probable than large errors, $\phi(x^2)$ must
decrease as x increases, and k must therefore be negative. Putting
$k = -h^2$, we can show as before that

$$A = \frac{h}{\sqrt{\pi}}.$$

Finally we may write

$$\phi(x^2) = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2},$$

which agrees with the form of the error law derived above (§ 8).
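That the exponential form satisfies the functional relation $\phi(x^2)\,\phi(y^2) = \phi(x^2 + y^2)\,\phi(0)$ may be verified directly. The sketch below is an illustration only; the function names are assumptions made for the example, and $\phi$ is taken as a function of the squared error, as in the argument above.

```python
import math

def phi(s, h=1.0):
    """phi as a function of the squared error s = x^2:
    (h / sqrt(pi)) * exp(-h^2 s)."""
    return h / math.sqrt(math.pi) * math.exp(-h * h * s)

def functional_gap(x, y, h=1.0):
    """Difference between phi(x^2) phi(y^2) and phi(x^2 + y^2) phi(0).
    It should vanish (up to rounding) for every x, y and h."""
    return phi(x * x, h) * phi(y * y, h) - phi(x * x + y * y, h) * phi(0.0, h)
```

The gap is zero to rounding error because the exponents add, which is the property that singles out the exponential among continuous solutions.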

10.     A  Generalised  Form  of  Hagen's  Proof*. 

Suppose that the error of observation is made up of a large
number of independent elementary errors, and that the probability
that any one of these elementary errors shall lie between $\epsilon$ and
$\epsilon + d\epsilon$ is $g(\epsilon)\,d\epsilon$. We shall assume that positive and negative
errors are equally likely, so that $g(\epsilon)$ is an even function, but
otherwise $g(\epsilon)$ may be regarded as being quite arbitrary. Then

$$\int_{-\infty}^{+\infty} g(\epsilon)\,d\epsilon = 1, \qquad \int_{-\infty}^{+\infty} \epsilon\,g(\epsilon)\,d\epsilon = 0. \tag{1}$$

Further let

$$\int_{-\infty}^{+\infty} \epsilon^2 g(\epsilon)\,d\epsilon = k^2.$$

The quantity k so defined may be called the mean square
elementary error.

Let $f(n, x)\,dx$ be the probability that the resultant error due to n of the
elementary errors lies between x and x + dx. Then we have

$$f(n+1, x) = \int_{-\infty}^{+\infty} f(n, x - \epsilon)\,g(\epsilon)\,d\epsilon.$$

This equation expresses the fact that the probability that n + 1
errors add up to x is made up of the probability that the first
n errors add up to $x - \epsilon$, multiplied by the probability that the
remaining error is $\epsilon$, summed for all possible values of $\epsilon$.

* The above proof is due to Professor Eddington.

Expand  both  sides  of  the  equation  by  Taylor's  Theorem  *. 

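$$f(n, x) + \frac{\partial}{\partial n} f(n, x) + \cdots = \int_{-\infty}^{+\infty} \left\{ f(n, x) - \epsilon\,\frac{\partial}{\partial x} f(n, x) + \frac{\epsilon^2}{2}\,\frac{\partial^2}{\partial x^2} f(n, x) - \text{etc.} \right\} g(\epsilon)\,d\epsilon$$

$$= f(n, x) + \frac{k^2}{2}\,\frac{\partial^2}{\partial x^2} f(n, x) + \cdots \quad \text{by (1)}.$$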

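Write $nk^2 = t$. Then $\sqrt{t}$ is the mean square error of the observations,
and remains finite when n is taken infinitely great, and $k^2$
accordingly infinitely small.
We then have

$$f + k^2\frac{\partial f}{\partial t} + \cdots = f + \frac{k^2}{2}\,\frac{\partial^2 f}{\partial x^2} + \cdots,$$

the omitted terms being of the order of $k^4$ when k is small.
Taking now a very large number of very small elementary errors,
so that $k^4$ is negligible in comparison with $k^2$, this gives

$$\frac{\partial f}{\partial t} = \frac{1}{2}\,\frac{\partial^2 f}{\partial x^2}. \tag{2}$$

It is easily verified that a solution of this equation is

$$f = C\,t^{-\frac12}\,e^{-\frac{x^2}{2t}}. \tag{3}$$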

It  remains  to  show  that  this  is  the  solution  applicable  to  our 
problem.  The  differential  equation  (2)  is  the  same  as  that  which 
determines  the  conduction  of  heat  along  a  bar.  Now  when  the 
mean  square  error  of  an  observation  is  zero,  the  probability  of  an 
error x vanishes except for x = 0; but unit probability is then
concentrated into an infinitesimal range at x = 0. The corresponding
condition in the heat problem is that initially (t = 0) the
bar is everywhere at zero temperature except that a unit quantity
of heat is concentrated in it at the point x = 0. Also, whatever be
the mean square error, f must vanish at x = +∞ and x = −∞;
in the heat problem this means that the temperature at the two
infinitely distant ends is zero. Now it is known from the theory
of the conduction of heat that these two conditions, the initial
condition and the end condition, are sufficient to determine
uniquely a solution of the equation. Hence if we can obtain a
solution for f satisfying (1) f = 0 when t = 0, for all values of x
except x = 0, and (2) f = 0 when x = +∞ and x = −∞, for all
values of t, this will be the only possible solution for the error law.
It is easily seen that the solution (3) does satisfy these conditions.
The constant C must be chosen so that

$$\int_{-\infty}^{+\infty} f\,dx = 1,$$

which gives $C = 1/\sqrt{2\pi}$; and writing

$$t = \frac{1}{2h^2},$$

we obtain the expression in the usual form,

$$f = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2}.$$

* Difficulties as to the possible divergence of the Taylor expansion for large
values of $\epsilon$ may be avoided by introducing the additional assumption that $g(\epsilon) = 0$
for values of $\epsilon$ beyond certain limits.
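The recurrence $f(n+1, x) = \int f(n, x - \epsilon)\,g(\epsilon)\,d\epsilon$ can be followed numerically for a simple elementary error. The sketch below is an illustration only and rests on an assumption not made in the text: each elementary error takes just the three values −δ, 0, +δ with probabilities ¼, ½, ¼, so that $k^2 = \delta^2/2$. Repeated convolution is then compared with the normal law for $t = nk^2$.

```python
import math

def convolve_elementary(n, probs=(0.25, 0.5, 0.25)):
    """Distribution of the sum of n elementary errors taking the values
    -delta, 0, +delta with the given probabilities (repeated convolution)."""
    dist = [1.0]  # no errors yet: unit probability concentrated at x = 0
    for _ in range(n):
        new = [0.0] * (len(dist) + 2)
        for i, p in enumerate(dist):
            for j, g in enumerate(probs):
                new[i + j] += p * g
        dist = new
    return dist  # index i corresponds to x = (i - n) * delta

def normal_mass(j, n, delta=1.0):
    """Normal-law probability for the cell of width delta centred on
    x = j * delta, with t = n k^2 and k^2 = delta^2 / 2."""
    t = n * delta ** 2 / 2
    x = j * delta
    return delta / math.sqrt(2 * math.pi * t) * math.exp(-x * x / (2 * t))

def max_discrepancy(n):
    """Largest difference between the exact distribution and the normal law."""
    dist = convolve_elementary(n)
    return max(abs(p - normal_mass(i - n, n)) for i, p in enumerate(dist))
```

Under these assumptions `max_discrepancy(n)` shrinks as n grows, in keeping with the passage to the limit made above.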

11.  The  proofs  of  the  Normal  Error  Law  given  above  are  based 
on  certain  definite  hypotheses  concerning  the  nature  of  accidental 
errors.  It  has  been  shown  that,  if  the  accidental  errors  to  which 
a  series  of  observations  is  liable  satisfy  these  hypotheses,  the  errors 
of  observation  will  be  distributed  according  to  the  normal  law. 
The  final  justification  of  the  use  of  Gauss's  Error  Curve  rests  upon 
the fact that it works well in practice, and yields curves which in
very many cases agree very closely with the observed frequency
curves. The normal law is to be regarded as proved by experiment,
and explained by Hagen's hypothesis. When the curve of
frequency of the actual errors is not of the form of the normal
curve,  we  may  safely  conclude  that  the  nature  of  the  accidental 
errors  concerned  is  not  in  accordance  with  Hagen's  hypothesis. 

The  normal  curve  has  applications  in  a  region  where  deviations 
from  a  mean  value  are  considered,  though  these  deviations  are 
not,  properly  speaking,  of  the  same  nature  as  accidental  errors  of 
observation ;  e.g.,  it  is  frequently  applied  in  biological  questions 
(vide Ex. 3, p. 42). The use of the normal curve in such cases is
justified  only  when  the  differences  between  individual  cases  are 
produced  by  causes  whose  mode  of  action  is  in  accordance  with 
Hagen's  hypothesis. 

12.     The Form of the Error Curve.

The equation of the Error Curve is

$$y = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2}.$$

The curve is symmetrical about the axis of y. The maximum
ordinate occurs at x = 0, and has the value $h/\sqrt{\pi}$. The curve can
easily be constructed by the use of tables of logarithms.
Differentiating this equation twice we obtain

$$\frac{d^2y}{dx^2} = \frac{2h^3}{\sqrt{\pi}}\,(2h^2x^2 - 1)\,e^{-h^2x^2},$$

showing that the curve has points of inflexion at

$$x = \pm\frac{1}{h\sqrt{2}}.$$

The form of the curve is shown in figure 3.

[Fig. 3. Gauss's Error Curve.]

The probability that the error of an observation shall lie
between a and b is

$$\frac{h}{\sqrt{\pi}}\int_{a}^{b} e^{-h^2x^2}\,dx, \quad\text{or}\quad \frac{1}{\sqrt{\pi}}\int_{ha}^{hb} e^{-t^2}\,dt.$$

The value of this integral can be derived from the table below.
The table gives the value of Θ(t), which is equal to

$$\frac{2}{\sqrt{\pi}}\int_{0}^{\rho x} e^{-t^2}\,dt,$$

for a series of values of x from 0 to 5.0, ρ being a constant whose
value is 0.47696, the reason for the use of which will appear later.


Table of $\Theta(t) = \frac{2}{\sqrt{\pi}}\int_{0}^{\rho x} e^{-t^2}\,dt$

    x      Θ(t)        x      Θ(t)
   0.0    0.000       2.6    0.921
   0.1    0.054       2.7    0.931
   0.2    0.107       2.8    0.941
   0.3    0.160       2.9    0.950
   0.4    0.213       3.0    0.957
   0.5    0.264       3.1    0.963
   0.6    0.314       3.2    0.969
   0.7    0.363       3.3    0.974
   0.8    0.411       3.4    0.978
   0.9    0.456       3.5    0.982
   1.0    0.500       3.6    0.985
   1.1    0.542       3.7    0.987
   1.2    0.582       3.8    0.990
   1.3    0.619       3.9    0.992
   1.4    0.655       4.0    0.993
   1.5    0.688       4.1    0.994
   1.6    0.719       4.2    0.995
   1.7    0.748       4.3    0.996
   1.8    0.775       4.4    0.997
   1.9    0.800       4.5    0.998
   2.0    0.823       4.6    0.998
   2.1    0.843       4.7    0.998
   2.2    0.862       4.8    0.999
   2.3    0.879       4.9    0.999
   2.4    0.895       5.0    0.999
   2.5    0.908        ∞     1.000

A table of values of $\frac{2}{\sqrt{\pi}}\int_{0}^{x} e^{-t^2}\,dt$ is given in Appendix II.
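The tabulated function is the ordinary error function evaluated at ρx, so the table can be regenerated with a standard library. The following Python sketch is offered as a check only; the names `theta` and `theta_table` are assumptions made for the example.

```python
import math

RHO = 0.47696  # the constant rho quoted in the text

def theta(x, rho=RHO):
    """(2 / sqrt(pi)) * integral from 0 to rho*x of exp(-t^2) dt,
    which is erf(rho * x)."""
    return math.erf(rho * x)

def theta_table(start=0.0, stop=5.0, step=0.1):
    """Rows (x, theta) rounded to three decimals, as in the printed table."""
    count = int(round((stop - start) / step)) + 1
    return [(round(start + i * step, 1), round(theta(start + i * step), 3))
            for i in range(count)]
```

With this choice of ρ the value at x = 1 is one half, which is the property that makes the constant convenient.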

For small values of x, the integral $\int_{0}^{x} e^{-x^2}\,dx$ may be obtained
by expanding $e^{-x^2}$ in a series:

$$e^{-x^2} = 1 - x^2 + \frac{x^4}{2!} - \frac{x^6}{3!} + \text{etc.}$$

This series is uniformly convergent, and may therefore be integrated
term by term.

$$\int_{0}^{x} e^{-x^2}\,dx = x - \frac{x^3}{1!\,3} + \frac{x^5}{2!\,5} - \frac{x^7}{3!\,7} + \text{etc.}$$
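The term-by-term integration can be sketched in a few lines. This is an illustration rather than a prescribed method; the function name and the default number of terms are assumptions.

```python
import math

def integral_power_series(x, terms=30):
    """Integral from 0 to x of exp(-u^2) du by the series
    x - x^3/(1!*3) + x^5/(2!*5) - x^7/(3!*7) + ..."""
    total = 0.0
    for n in range(terms):
        total += (-1) ** n * x ** (2 * n + 1) / (math.factorial(n) * (2 * n + 1))
    return total
```

For small x a handful of terms suffices; the result may be compared with `math.sqrt(math.pi) / 2 * math.erf(x)`.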
When x is not small the above series converges very slowly,
and it is then better to use another formula obtained by integrating
by parts.

$$\int e^{-x^2}\,dx = -\frac{1}{2x}\,e^{-x^2} - \frac{1}{2}\int \frac{1}{x^2}\,e^{-x^2}\,dx$$

$$= -\frac{1}{2x}\,e^{-x^2} + \frac{1}{2^2x^3}\,e^{-x^2} + \frac{1\cdot 3}{2^2}\int \frac{1}{x^4}\,e^{-x^2}\,dx.$$

Continuing the process we find

$$\int_{x}^{\infty} e^{-x^2}\,dx = \frac{e^{-x^2}}{2x}\left\{1 - \frac{1}{2x^2} + \frac{1\cdot 3}{(2x^2)^2} - \frac{1\cdot 3\cdot 5}{(2x^2)^3} + \text{etc.}\right\},$$

and since

$$\int_{0}^{x} e^{-x^2}\,dx = \int_{0}^{\infty} e^{-x^2}\,dx - \int_{x}^{\infty} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2} - \int_{x}^{\infty} e^{-x^2}\,dx,$$

we can apply the second series to evaluate $\int_{0}^{x} e^{-x^2}\,dx$.

When x is large a still better method is to use a series due to
Schlömilch*.

$$\int_{x}^{\infty} e^{-x^2}\,dx = \frac{e^{-x^2}}{2x}\left\{1 - \frac{1}{2(x^2+1)} + \frac{1}{2^2(x^2+1)(x^2+2)} - \frac{5}{2^3(x^2+1)(x^2+2)(x^2+3)} + \frac{9}{2^4(x^2+1)(x^2+2)(x^2+3)(x^2+4)} - \frac{129}{2^5(x^2+1)(x^2+2)(x^2+3)(x^2+4)(x^2+5)} + \cdots\right\}.$$

This series converges very rapidly for large values of x.
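The two expansions for the tail integral may be compared against a library value. The sketch below is illustrative only: the function names are assumptions, the Schlömilch numerators are the six quoted above with alternating signs, and the exact value is taken from the complementary error function.

```python
import math

def tail_exact(x):
    """Integral from x to infinity of exp(-u^2) du, via the library erfc."""
    return math.sqrt(math.pi) / 2 * math.erfc(x)

def tail_integration_by_parts(x, terms=4):
    """(e^{-x^2} / 2x) * {1 - 1/(2x^2) + 1*3/(2x^2)^2 - 1*3*5/(2x^2)^3 + ...},
    truncated after `terms` terms."""
    bracket, term = 0.0, 1.0
    for n in range(terms):
        bracket += term
        term *= -(2 * n + 1) / (2 * x * x)
    return math.exp(-x * x) / (2 * x) * bracket

SCHLOMILCH_NUMERATORS = (1, -1, 1, -5, 9, -129)

def tail_schlomilch(x):
    """Schlomilch's series, using only the six terms quoted in the text."""
    bracket, denom = 0.0, 1.0
    for n, c in enumerate(SCHLOMILCH_NUMERATORS):
        if n > 0:
            denom *= 2 * (x * x + n)  # builds 2^n (x^2+1)(x^2+2)...(x^2+n)
        bracket += c / denom
    return math.exp(-x * x) / (2 * x) * bracket
```

At a moderate argument such as x = 2 the Schlömilch form with these six terms is noticeably closer to the library value than four terms of the series obtained by integration by parts.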

The error curve drawn above, represented by the function

$$y = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2},$$

extends to infinity along the axis of x, in both directions, having
the axis of x as an asymptote. In a system of errors accurately
represented by this curve, the errors are continuous, and extend
from −∞ to +∞. In actual observations the errors are never
greater than a reasonably small finite limit, and the curve which
would accurately represent the system of errors should meet the
x-axis at a finite distance from the origin. But as the normal
curve of errors shown in figure 3 rapidly approaches the axis of x,
so that its ordinates become infinitesimally small at a short
distance from the origin, no considerable error is introduced by
regarding the curve of actual errors as approaching the axis of x
asymptotically, and so regarding the limits of possible errors of
measurement as being extended to ±∞.

* Kompendium der höhern Analysis, Bd. ii, p. 266.

13.     The  Arithmetic  Mean. 

If $x_1, x_2, \ldots, x_n$ be n observed values of a quantity x, the
errors of the separate observations are $x_1 - x$, $x_2 - x$, etc. The
probability of making this system of errors is proportional to

$$e^{-h^2\sum (x_r - x)^2}.$$

The value of x must be so chosen that this probability shall
be a maximum, or so that $\sum (x_r - x)^2$ shall be a minimum.
Differentiating the last expression we obtain an equation for x,

$$\sum (x_r - x) = 0, \quad\text{or}\quad \sum x_r - nx = 0.$$

Hence

$$x = \frac{1}{n}\sum x_r = \frac{1}{n}(x_1 + x_2 + \cdots + x_n). \tag{1}$$

The most probable value of the unknown is therefore the arithmetic
mean of the observed values.

This result may be obtained without the aid of the differential
calculus. If $\bar{x}$ be the arithmetic mean of $x_1, x_2, \ldots, x_n$, and x any
other value of the unknown,

$$\sum (x_r - x)^2 - \sum (x_r - \bar{x})^2 = n(x^2 - \bar{x}^2) - 2(x - \bar{x})\sum x_r$$

$$= n(x^2 - \bar{x}^2) - 2n\bar{x}(x - \bar{x})$$

$$= n(x - \bar{x})^2.$$

$$\therefore\ \sum (x_r - x)^2 = \sum (x_r - \bar{x})^2 + n(x - \bar{x})^2. \tag{2}$$

It follows that $\sum (x_r - x)^2$ is least when $x = \bar{x}$, or when x coincides
with the arithmetic mean of the observed values.

It  will  be  convenient  to  use  the  abbreviation  A.M.  to  denote 
the  arithmetic  mean. 

The quantities obtained by subtracting the A.M. $\bar{x}$ from each
of the observed values are called the residuals, and are usually
denoted by $v_1$, $v_2$, etc. We thus have a set of n equations,

$$x_1 - \bar{x} = v_1,$$
$$x_2 - \bar{x} = v_2,$$
$$\cdots$$
$$x_n - \bar{x} = v_n.$$

Also

$$v_1 + v_2 + \cdots + v_n = (x_1 + x_2 + \cdots + x_n) - n\bar{x} = 0. \tag{3}$$

The  name  of  the  method  of  "Least  Squares"  is  due  to  the 
fact  brought  out  by  equation  (2),  that  the  most  probable  value  of 
the  unknown  is  that  value  for  which  the  sum  of  the  squares  of  the 
residuals  is  least. 

In  actual  computation  it  is  necessary  to  have  numerical  checks 
upon  the  calculated  values  of  the  A.M.  and  the  sum  of  the  squares 
of the residuals. If $\bar{x}$ be the A.M. and x' any other quantity,
then

$$\bar{x} = x' + \frac{\sum (x_r - x')}{n},$$

$$\sum (x_r - \bar{x})^2 = -n(\bar{x} - x')^2 + \sum (x_r - x')^2.$$

A calculation from some second base x' thus affords a useful check
upon the value of the mean and upon the sum of the squares of
the residuals. Also, in some cases, it may be more convenient to
evaluate $\sum (x_r - x')^2$ than $\sum (x_r - \bar{x})^2$.

The accuracy of the A.M. can also be checked by means of the
sum of the residuals, since $\sum v = 0$.
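The numerical checks described above can be sketched as follows. The function names and the sample figures in the usage note are illustrative assumptions, not data from the text.

```python
def arithmetic_mean(xs):
    """The A.M. of the observed values."""
    return sum(xs) / len(xs)

def sum_sq(xs, base):
    """Sum of squared deviations of the observations from `base`."""
    return sum((x - base) ** 2 for x in xs)

def mean_from_base(xs, base):
    """A.M. computed from a working base x': x' + sum(x_r - x') / n."""
    return base + sum(x - base for x in xs) / len(xs)

def residual_sum_sq_from_base(xs, base):
    """Sum of squared residuals via sum((x_r - x')^2) - n * (mean - x')^2."""
    m = mean_from_base(xs, base)
    return sum_sq(xs, base) - len(xs) * (m - base) ** 2
```

For the hypothetical observations 10.1, 10.3, 9.8, 10.0, 10.3 and the base x' = 10.0, both routes give the mean 10.1 and the same sum of squared residuals, and any other trial value gives a larger sum of squares.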

14.  Proof  of  the  Normal  Law  from  the  Principle  of 
the  Arithmetic  Mean. 

In  §  13  above  we  derived  the  principle  of  the  arithmetic  mean 
from  the  Normal  Law.  It  is  interesting  to  note  that  we  can 
invert  the  process,  and  derive  the  Normal  Law  from  the  principle 
of  the  arithmetic  mean. 

Let the probability that the error of an observation should lie
between $\Delta$ and $\Delta + d\Delta$ be represented by $\phi(\Delta)\,d\Delta$. Our problem
is to find the form of the function $\phi(\Delta)$. As $d\Delta$ tends to zero
$\phi(\Delta)\,d\Delta$ also tends to zero; or in other words, the probability of
making an error of the exact value $\Delta$ is zero. When we speak of
the "probability of an error $\Delta$," we shall interpret this expression
as meaning the probability of an error between $\Delta - \alpha$ and $\Delta + \alpha$,
where $\alpha$ is a small quantity which is just inappreciable in the
observation in question. With this convention we may say that
the probability of an error $\Delta$ is $C\phi(\Delta)$, where C is a constant.

Let $x_1, x_2, \ldots, x_n$ be n observed values of an unknown quantity
x. If x be the assumed value of the unknown, the errors of
the separate observations will be $\Delta_1, \Delta_2, \ldots, \Delta_n$, defined by

$$\Delta_1 = x_1 - x, \quad \Delta_2 = x_2 - x, \quad\text{etc.}$$

The probability of the occurrence of an error $\Delta_r$ is $C\phi(\Delta_r)$,
and the probability of the occurrence of the system of errors
$\Delta_1, \Delta_2, \ldots, \Delta_n$ is

$$C^n\,\phi(\Delta_1)\,\phi(\Delta_2)\cdots\phi(\Delta_n).$$

The assumed value of the unknown, x, must be the most probable
value of the unknown; in other words, it will be such as to make
the system of errors $\Delta_1$, $\Delta_2$, etc., the most probable. It follows
that the probability

$$C^n\,\phi(\Delta_1)\,\phi(\Delta_2)\cdots\phi(\Delta_n)$$

must be a maximum for the assumed value of x. Differentiating
with respect to x, we may write this condition in the form

$$\frac{\phi'(\Delta_1)}{\phi(\Delta_1)}\frac{d\Delta_1}{dx} + \frac{\phi'(\Delta_2)}{\phi(\Delta_2)}\frac{d\Delta_2}{dx} + \cdots + \frac{\phi'(\Delta_n)}{\phi(\Delta_n)}\frac{d\Delta_n}{dx} = 0,$$

where $\phi'(\Delta_r) = \dfrac{d}{d\Delta_r}\phi(\Delta_r)$.

But since $\Delta_1 = x_1 - x$, $\Delta_2 = x_2 - x$, etc.,

$$\frac{d\Delta_1}{dx} = \frac{d\Delta_2}{dx} = \cdots = -1,$$

and the condition for a maximum may be written

$$\sum \frac{\phi'(\Delta_r)}{\phi(\Delta_r)} = 0. \tag{1}$$

Equation  (1)  must  be  satisfied  for  the  most  probable  value  of  x. 

We  now  assume  that  this  value  of  x  is  the  A.M.     With  this 
assumption,  we  may  write  down  the  additional  equation 

$$\sum \Delta_r = 0, \tag{2}$$

which  is  simply  equation  (3)  of  §  13. 

Equations  (1)  and  (2)  must  be  simultaneously  satisfied  by  the 
same  value  of  the  unknown,  and  so  they  must  be  identical. 

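$$\therefore\ \frac{\phi'(\Delta_1)}{\Delta_1\,\phi(\Delta_1)} = \frac{\phi'(\Delta_2)}{\Delta_2\,\phi(\Delta_2)} = \cdots = \frac{\phi'(\Delta_n)}{\Delta_n\,\phi(\Delta_n)} = \text{constant} = k \text{ say}.$$

The relation $\phi'(\Delta) = k\,\Delta\,\phi(\Delta)$
yields a differential equation whose solution determines the form
of $\phi(\Delta)$. Integrating the equation, we find

$$\log \phi(\Delta) = \tfrac{1}{2}k\Delta^2 + \text{const.},$$

or

$$\phi(\Delta) = A\,e^{\frac{1}{2}k\Delta^2}.$$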

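The product $\phi(\Delta_1)\,\phi(\Delta_2)\cdots\phi(\Delta_n)$ must be a maximum for the
assumed value of x. Hence the series $\sum \log \phi(\Delta_r)$ must be a
maximum. The condition to be satisfied is that

$$\frac{d^2}{dx^2}\sum \log \phi(\Delta) \text{ shall be negative},$$

or

$$\sum \frac{d^2}{dx^2}\left(\tfrac{1}{2}k\Delta^2\right) \text{ shall be negative}.$$

Since $\Delta_r = x_r - x$,

$$\frac{d^2}{dx^2}\,\Delta_r^2 = 2,$$

and

$$\sum \frac{d^2}{dx^2}\left(\tfrac{1}{2}k\Delta_r^2\right) = nk.$$

It follows that k must be negative.
Putting $\tfrac{1}{2}k = -h^2$, we may write

$$\phi(\Delta) = A\,e^{-h^2\Delta^2}.$$

The value of A may be derived as in § 8, yielding

$$\phi(\Delta) = \frac{h}{\sqrt{\pi}}\,e^{-h^2\Delta^2}.$$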

15.     The  Law  of  Error   of  a   linear  function  of  two 
independent  quantities  whose  laws  of  error  are  known. 

Let $m_1$, $m_2$ be two independent observed quantities, obeying
the error laws

$$\frac{h_1}{\sqrt{\pi}}\,e^{-h_1^2x^2} \quad\text{and}\quad \frac{h_2}{\sqrt{\pi}}\,e^{-h_2^2x^2} \quad\text{respectively}.$$

The probability of the occurrence of an error between $x_1$ and
$x_1 + dx_1$ in $m_1$ is

$$\frac{h_1}{\sqrt{\pi}}\,e^{-h_1^2x_1^2}\,dx_1,$$

and the probability of the occurrence of an error between $x_2$ and
$x_2 + dx_2$ in $m_2$ is

$$\frac{h_2}{\sqrt{\pi}}\,e^{-h_2^2x_2^2}\,dx_2.$$

Since $m_1$ and $m_2$ are independent, the probability of the simultaneous
occurrence of these errors in $m_1$ and $m_2$ is the product
of the two separate probabilities. Calling this probability p we
have the equation

$$p = \frac{h_1h_2}{\pi}\,e^{-h_1^2x_1^2 - h_2^2x_2^2}\,dx_1\,dx_2.$$

If the linear function in question be

$$a_1m_1 + a_2m_2 = F,$$

the corresponding error in F will lie between

$$a_1x_1 + a_2x_2 = x \text{ (say)}$$

and

$$a_1(x_1 + dx_1) + a_2(x_2 + dx_2) = x + dx.$$

An error x in F may be derived from any error $x_1$ in $m_1$, combined
with the error $x_2$ in $m_2$; where $x_2$ is fixed by the relation

$$a_1x_1 + a_2x_2 = x.$$

Substituting for $x_2$, we find

$$p = \frac{h_1h_2}{\pi}\exp\left\{-h_1^2x_1^2 - h_2^2\left(\frac{x - a_1x_1}{a_2}\right)^2\right\}dx_1\,dx_2$$

$$= \frac{h_1h_2}{\pi}\exp\left\{-\frac{h_1^2h_2^2x^2}{a_2^2h_1^2 + a_1^2h_2^2}\right\}\exp\left\{-\frac{a_2^2h_1^2 + a_1^2h_2^2}{a_2^2}\left(x_1 - \frac{a_1h_2^2x}{a_2^2h_1^2 + a_1^2h_2^2}\right)^2\right\}dx_1\,dx_2.$$

IT 

The probability of the occurrence of an error x must take
into account the fact that x may be made up of any value of $x_1$
between −∞ and +∞, combined with the corresponding value
of $x_2$ fixed by the equation

$$x = a_1x_1 + a_2x_2.$$

This generality is obtained by integrating the function p with
respect to $x_1$, between the limits ±∞, afterwards taking into
account the relation $dx = a_2\,dx_2$ which holds when $x_1$ has been
eliminated.
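Let $\phi(x)\,dx$ be the probability of the occurrence of an error
between x and x + dx. Then, as above suggested,

$$\phi(x)\,dx = \int_{-\infty}^{+\infty} p$$

$$= \frac{h_1h_2}{\pi}\,dx_2\,\exp\left\{-\frac{h_1^2h_2^2x^2}{a_2^2h_1^2 + a_1^2h_2^2}\right\}\int_{-\infty}^{+\infty}\exp\left\{-\frac{a_2^2h_1^2 + a_1^2h_2^2}{a_2^2}\left(x_1 - \frac{a_1h_2^2x}{a_2^2h_1^2 + a_1^2h_2^2}\right)^2\right\}dx_1$$

$$= \frac{a_2h_1h_2}{\sqrt{\pi}\,\sqrt{a_2^2h_1^2 + a_1^2h_2^2}}\,\exp\left\{-\frac{h_1^2h_2^2x^2}{a_2^2h_1^2 + a_1^2h_2^2}\right\}dx_2.$$

Taking into account the relation $dx = a_2\,dx_2$, we find

$$\phi(x)\,dx = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2}\,dx,$$

where

$$h^2 = \frac{h_1^2h_2^2}{a_2^2h_1^2 + a_1^2h_2^2}.$$

Thus the law of error of F is of the same form as the laws of error
of $m_1$ and $m_2$, and the parameter h is fixed by the last equation.
This equation may be written

$$\frac{1}{h^2} = \frac{a_1^2}{h_1^2} + \frac{a_2^2}{h_2^2}.$$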
This formula is capable of generalisation for a linear function of
any number of independent variables. Thus when

$$F = a_1m_1 + a_2m_2 + a_3m_3,$$

if $h$, $h_1$, $h_2$, $h_3$ be the parameters of F, $m_1$, $m_2$, $m_3$, respectively,
and $h'$ the parameter of the error law of the function $a_2m_2 + a_3m_3$,
we have the relations

$$\frac{1}{h'^2} = \frac{a_2^2}{h_2^2} + \frac{a_3^2}{h_3^2}$$

and

$$\frac{1}{h^2} = \frac{a_1^2}{h_1^2} + \frac{1}{h'^2}.$$

Hence we may write

$$\frac{1}{h^2} = \frac{a_1^2}{h_1^2} + \frac{a_2^2}{h_2^2} + \frac{a_3^2}{h_3^2}.$$
In the same way, the result obtained above may be extended in
succession to 4, 5, or any other number of independent variables.
The final result may be stated thus:

If F be a linear function of n independent quantities, which
have been determined by observation, the function F follows an
error law which is of the same form as the error laws of the
independent unknowns. If the function is

$$F = a_1m_1 + a_2m_2 + \cdots + a_nm_n,$$

the parameter h of the function F is given by

$$\frac{1}{h^2} = \frac{a_1^2}{h_1^2} + \frac{a_2^2}{h_2^2} + \cdots + \frac{a_n^2}{h_n^2}.$$
16.     The  Median. 

The  median,  which  is  the  value  of  the  unknown  which  has  as 
many  observed  values  on  one  side  of  it  as  on  the  other,  is  the 
natural  competitor  of  the  arithmetic  mean,  and  it  is  interesting  to 
consider  the  law  of  error  which  follows  from  the  assumption  that 
the  median  is  the  most  probable  value  of  the  unknown. 

Let x' be the median, and let $f(x - x')\,dx$ be the probability
of making an observation between x and x + dx. If $x_1, x_2, \ldots, x_n$
be the observed values, we have to make the product

$$f(x_1 - x')\,f(x_2 - x')\cdots f(x_n - x') \text{ a maximum},$$

or the sum

$$\sum \log f(x_r - x') \text{ a maximum}.$$

Differentiating with respect to x', we obtain the condition for
a maximum in the form

$$\sum \frac{f'(x_r - x')}{f(x_r - x')} = 0.$$

Let

$$F(u) = \frac{f'(u)}{f(u)},$$

where u is the error $x_r - x'$.

Then $\sum F(u) = 0$ whenever the numbers of positive and negative
errors are equal. This is satisfied by making

$$F(u) = \pm k,$$

where the upper or lower sign is to be taken according as u is
positive or negative.
Writing

$$\frac{f'}{f} = \pm k$$

and integrating, we find

$$\log f(u) = \pm ku + \text{constant} = k|u| + \text{constant},$$

and

$$f(u) = C\,e^{k|u|}.$$

Since the probability must decrease as u increases it follows
that k is negative, and, writing $k = -h$, the form of the error law is

$$f(u) = C\,e^{-h|u|}.$$

Since any error must lie between −∞ and +∞, we have, as in the
case of the Normal Law,

$$1 = \int_{-\infty}^{+\infty} f(u)\,du = 2C\int_{0}^{\infty} e^{-hu}\,du = \frac{2C}{h},$$

and finally we may write

$$f(u) = \frac{h}{2}\,e^{-h|u|}.$$

EXAMPLES. 


1.     Show that the average value of the error (η of page 30) is 1/h when
the above law holds.

2.  Show  that  the  above  law  yields  as  the  most  probable  value  of  the 
unknown  that  which  makes  the  arithmetic  sum  of  the  errors  a  minimum. 

3.  Given  that  the  arithmetic  sum  of  the  errors  is  to  be  a  minimum, 
show  that  the  median  is  the  most  probable  value  of  the  unknown. 

4.  Find the P.E. of a single observation.
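The connexion between the median and the arithmetic sum of the errors, which the examples above ask the reader to establish, can be illustrated numerically. The sketch below is not a proof; the function names, the grid search and the sample values are assumptions made for the illustration.

```python
import statistics

def sum_abs_errors(xs, centre):
    """Arithmetic sum of the errors, taken without regard to sign."""
    return sum(abs(x - centre) for x in xs)

def best_centre_on_grid(xs, step=0.01):
    """Grid search for the value minimising the sum of absolute errors."""
    lo, hi = min(xs), max(xs)
    count = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(count)]
    return min(grid, key=lambda c: sum_abs_errors(xs, c))
```

For an odd number of hypothetical observations such as 1.0, 2.0, 2.5, 7.0, 9.0 the search settles on the median 2.5, and the sum of absolute errors there is smaller than at the arithmetic mean.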
CHAPTER III

THE    CASE    OF   ONE    UNKNOWN 

17.     Measures  of  Precision. 

The area bounded by the axis of x, and the curve

$$y = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2}$$

is  unity  for  all  values  of  h.  The  greater  the  value  of  h,  the  greater 
is  the  central  ordinate,  and  the  steeper  the  curve ;  i.e.  the  greater 
the  value  of  h,  the  more  closely  will  the  observations  be  clustered 
about  the  mean  value.  If  two  sets  of  observations  of  the  same 
quantity  be  made,  the  series  for  which  h  is  greater  will  be  more 
closely  clustered  about  the  mean  value,  and  may  therefore  be 
regarded  as  a  better  set  of  observations  than  the  set  for  which  h 
is  less.  Hence  h  may  be  regarded  as  a  measure  of  the  precision 
of  the  observations,  and  may  be  used  as  a  criterion  for  judging 
the  accuracy  with  which  the  observations  have  been  carried  out. 

In practice it is found more convenient to use certain functions
of h, rather than h itself, as measures of precision. Most continental
writers use a function which is termed the mean square error or
M.S.E., a term which is loosely used to denote the square root of the
average value of the squares of the errors of observation. It is
usually denoted by the Greek letter μ.

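Of a set of n observations, the number whose errors will lie
between $\Delta$ and $\Delta + d\Delta$ will be

$$\frac{nh}{\sqrt{\pi}}\,e^{-h^2\Delta^2}\,d\Delta.$$

The sum of the squares of these errors will be

$$\frac{nh}{\sqrt{\pi}}\,\Delta^2 e^{-h^2\Delta^2}\,d\Delta.$$

If μ be the M.S.E., it follows that the sum of the squares of all the
errors from −∞ to +∞

$$= n\mu^2 = \frac{nh}{\sqrt{\pi}}\int_{-\infty}^{+\infty} \Delta^2 e^{-h^2\Delta^2}\,d\Delta = \frac{2nh}{\sqrt{\pi}}\int_{0}^{\infty} \Delta^2 e^{-h^2\Delta^2}\,d\Delta = \frac{n}{2h^2}.$$

$$\therefore\ \mu^2 = \frac{1}{2h^2}, \quad\text{and}\quad \mu = \frac{1}{h\sqrt{2}}.$$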

English  and  American  writers  generally  use  as  a  measure  of 
precision  a  quantity  r,  unhappily  named  the  probable  error  or  P.E.* 
The  P.E.,  r,  is  of  such  a  magnitude  that  the  error  of  a  single 
observation  is  as  likely  to  be  within  as  without  the  limits  ±  r. 
Or,  to  express  it  in  another  way,  the  odds  are  even  that  the  error 
of  a  single  observation  shall  not  be  greater  in  magnitude  than  r. 
This  condition  is  expressed  mathematically  by  the  equation 

$$\frac{1}{2} = \frac{h}{\sqrt{\pi}}\int_{-r}^{+r} e^{-h^2x^2}\,dx = \frac{2}{\sqrt{\pi}}\int_{0}^{hr} e^{-x^2}\,dx.$$

This equation can be solved by the use of tables of the integral
$\int e^{-x^2}\,dx$, yielding the result

$$hr = 0.47696 = \rho.$$

The quantity ρ is the one mentioned on page 19.
It follows that

$$r = \frac{0.47696}{h} = 0.6745\,\mu = \tfrac{2}{3}\mu \text{ (approximately)}.$$

The  P.E.  has  a  very  simple  meaning,  in  that  one-half  of  the 
observations  in  a  series  should  have  errors  greater  than  r,  and  the 
other  half  should  have  errors  less  than  r. 

* The most probable error is zero, corresponding to the highest point of the
error curve.

Another measure of precision which is sometimes used is the
average error η, whose value is the average of the errors of all the
observations, considered without regard to sign. Its relation to h
can be easily derived.

$$\eta = \frac{2h}{\sqrt{\pi}}\int_{0}^{\infty} x\,e^{-h^2x^2}\,dx = \frac{1}{h\sqrt{\pi}}.$$

It follows that

$$\eta = \mu\sqrt{\frac{2}{\pi}},$$

and

$$r = 0.6745\,\mu = 0.8453\,\eta.$$


The relations of 1/h, μ, r, and η are here collected in tabular
form for convenience of reference. Each row gives the quantity on the
left as a multiple of the quantity heading the column.

              1/h        μ         r         η
   1/h  =   1.0000    1.4142    2.0966    1.7726
   μ    =   0.7071    1.0000    1.4826    1.2533
   r    =   0.4769    0.6745    1.0000    0.8453
   η    =   0.5642    0.7979    1.1829    1.0000

The  following  approximate  values  of  some  of  the  above  coefficients  are 
occasionally  of  use  : 

•6745   =%. 

•8453   =1^. 

•47696  =  *^. 
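The coefficients of the table follow directly from the three relations $\mu = 1/(h\sqrt{2})$, $r = \rho/h$ and $\eta = 1/(h\sqrt{\pi})$. A minimal Python sketch, with an assumed function name, is:

```python
import math

RHO = 0.47696  # the value of h * r quoted in the text

def measures_from_h(h):
    """Mean square error mu, probable error r and average error eta
    corresponding to a given measure of precision h."""
    mu = 1.0 / (h * math.sqrt(2.0))
    r = RHO / h
    eta = 1.0 / (h * math.sqrt(math.pi))
    return mu, r, eta
```

Taking h = 1 reproduces the first column of the table, and the ratios of the returned values reproduce the remaining entries to the number of figures printed.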


18.     Evaluation  of  h  and  r. 

The relation between h and the residuals can be obtained
as follows. Let n similarly observed values of a quantity be
$x_1, x_2, x_3, \ldots, x_n$, and let X be the true value of the quantity.
Then the errors of the separate observations are $x_1 - X$, $x_2 - X$,
..., $x_n - X$. The probability of an error lying between x and
x + dx is

$$\frac{h}{\sqrt{\pi}}\,e^{-h^2x^2}\,dx.$$

As on page 22 this may be expressed by saying that the probability
of an error x is $Ch\,e^{-h^2x^2}$, where C is a constant. Then the a priori
probability of the occurrence of the system of errors $x_1 - X$, $x_2 - X$,
etc. is

$$C^n h^n e^{-h^2\sum (x_r - X)^2},$$

or

$$C^n h^n e^{-h^2\sum v_r^2 - nh^2(M - X)^2},$$

where

$$M = \frac{\sum x}{n} \quad\text{and}\quad v_r = x_r - M.$$

The true value of X is unknown. All that can definitely be
said is that it lies between −∞ and +∞. Thus the probability
of the given system of measurements $x_1, x_2, \ldots, x_n$ must be written
as the integral

$$\int_{-\infty}^{+\infty} C^n h^n e^{-h^2\sum v_r^2 - nh^2(M - X)^2}\,dX,$$

and this can immediately be reduced to

$$C^n h^{n-1}\sqrt{\frac{\pi}{n}}\;e^{-h^2\sum v_r^2}.$$

The value of h must be such as to make this probability a
maximum. Taking logarithms and differentiating with respect
to h, we find

$$\frac{n-1}{h} - 2h\sum v^2 = 0, \qquad\therefore\ h^2 = \frac{n-1}{2\sum v^2} = \frac{n-1}{2[vv]}.$$

In writings on the present subject it is customary to denote
summation, not by the letter Σ, but by square brackets. The last
equation reduces to

$$h = \sqrt{\frac{n-1}{2[vv]}}.$$

It immediately follows that

$$r = \frac{0.47696}{h} = 0.6745\sqrt{\frac{[vv]}{n-1}}.$$
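The two formulae just obtained translate directly into a short routine. The function name and the sample observations are assumptions made for illustration.

```python
import math

def precision_from_observations(xs):
    """Return (h, r), with h = sqrt((n - 1) / (2 [vv])) and
    r = 0.6745 * sqrt([vv] / (n - 1))."""
    n = len(xs)
    mean = sum(xs) / n
    vv = sum((x - mean) ** 2 for x in xs)  # [vv], the sum of squared residuals
    h = math.sqrt((n - 1) / (2.0 * vv))
    r = 0.6745 * math.sqrt(vv / (n - 1))
    return h, r
```

For the hypothetical observations 10.1, 10.3, 9.8, 10.0, 10.3 the residuals have [vv] = 0.18, giving h = 3.333 and r = 0.143 in the units of the observations; the product hr returns the constant 0.47696 to the accuracy of the rounded coefficient 0.6745.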
19.  Comparison of a set of observations with the
preceding theory.

Gauss (Werke, iv. p. 116) took Bessel's reduction of 470 observations
of the right ascensions of Procyon and Altair made by Bradley,
and compared the distribution of errors with the theoretical curve
obtained by evaluating h by the above formula. He calculated
the numbers of observations whose errors should be numerically
between 0".0 and 0".1, between 0".1 and 0".2, etc., and compared
them with the actual numbers obtained from Bradley's observations.
The results are given in the following table.


   Errors              Theoretical    Actual
                         number       number
   0".0 to 0".1           94.8          94
   0".1 to 0".2           88.8          88
   0".2 to 0".3           78.3          78
   0".3 to 0".4           64.1          58
   0".4 to 0".5           49.5          51
   0".5 to 0".6           35.8          36
   0".6 to 0".7           24.2          26
   0".7 to 0".8           15.4          14
   0".8 to 0".9            9.1          10
   0".9 to 1".0            5.0           7
   above 1".0              5.0           8

The  table  shows  a  remarkable  correspondence  between  the 
theory  and  the  observational  data.  There  is,  however,  a  slight 
discrepancy  in  the  number  of  large  errors,  the  number  occurring 
in  practice  exceeding  the  theoretical  number.  This  discrepancy 
occurs  in  other  series  of  observations,  and  some  attempt  to  deal 
with  it  will  be  made  in  a  later  chapter. 
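The theoretical column is obtained by integrating the error law over each class. The sketch below shows the computation only; the value of h used is an assumption, chosen here so that the first class comes out near the tabulated 94.8, and is not a figure quoted in the text.

```python
import math

def expected_counts(n, h, width, bins):
    """Expected numbers of errors whose magnitude lies in [0, w), [w, 2w), ...,
    followed by the number beyond bins * w."""
    edges = [math.erf(h * width * i) for i in range(bins + 1)]
    counts = [n * (edges[i + 1] - edges[i]) for i in range(bins)]
    counts.append(n * (1.0 - edges[-1]))  # everything above the last limit
    return counts

# Assumed value of h (per second of arc); illustrative, not taken from the text.
H_ASSUMED = 1.807
theoretical = expected_counts(470, H_ASSUMED, 0.1, 10)
```

With this assumed h the first two classes come out close to 94.8 and 88.8, and the eleven entries necessarily sum to 470.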

20.     Evaluation  of  fi. 

The quantity μ, the M.S.E. of a system of errors, is connected
with h by the relation

$$\mu = \frac{1}{h\sqrt{2}}.$$

It has already been shown that h should be given by

$$h = \sqrt{\frac{n-1}{2[vv]}},$$

and so μ should be given by

$$\mu = \sqrt{\frac{[vv]}{n-1}}.$$

This relation is of such importance that it is necessary to consider
it in some detail. The residuals $v_1$, $v_2$, etc. are the deviations
of the observed values from the A.M., and if the A.M. could be
definitely regarded as the true value of the unknown, the (M.S.E.)²
ought to be equal to $[vv]/n$. The use of the denominator n − 1
instead of n must be regarded as due to the uncertainty as to the
true value of the unknown. The derivation of the formula for μ
from a purely algebraic standpoint may help to elucidate the
question.

Let $x_1, x_2, \ldots, x_n$ be the n observed values. Now suppose these
to be the first n observations in a series of N observations. Let
$V_1, V_2, \ldots, V_N$ be the residuals calculated from the mean of the
N observations.

Let

$$\mu^2 = \frac{[VV]}{N}, \qquad \mu'^2 = \frac{[vv]}{n}.$$

If $v_s$ be the residual of the sth observation within the group of
n observations, then

$$V_s = d + v_s, \tag{1}$$

where d is the difference between the mean of the N observations and
the mean of the group of n. The value of d is given by

$$d = \frac{V_1 + V_2 + \cdots + V_n}{n}. \tag{2}$$

Squaring  equation  (2)  we  find 

There  will  be  -|/i(/i  — 1)  terms  of  the  form  V^Vt,  and  the 
equation  may  be  ^^Titten 

\    n  n  —  \ 

cZ'-  =  —  2)  F/  -\ X  mean  value  of  FgF^  (3). 

n-  1  n 

If a large number of different groups of $n$ be selected from
among the total of $N$ observations, and equation (3) be formed for
each group, we can write down the mean value of each side of
equation (3) for all these groups.

The resulting equation is

$$\overline{d^2} = \frac{\mu^2}{n} + \frac{n-1}{n} \times \text{mean value of } V_sV_t \qquad (4),$$

where $\overline{d^2}$ is the mean value of $d^2$.

But $$0 = (V_1 + V_2 + \ldots + V_N)^2 = [VV] + N(N-1) \times \text{mean value of } V_sV_t.$$

$$\therefore \text{ mean value of } V_sV_t = -\frac{[VV]}{N(N-1)} = -\frac{\mu^2}{N-1} \qquad (4a).$$

Substituting this result in equation (4) we find

$$\overline{d^2} = \frac{\mu^2}{n} - \frac{n-1}{n}\cdot\frac{\mu^2}{N-1} = \frac{\mu^2(N-n)}{n(N-1)} \qquad (5).$$

Now $$V_s^2 = (d + v_s)^2 = d^2 + 2dv_s + v_s^2.$$

There will be $n$ such equations, and if we write these $n$ equations
down and take the mean value of both sides, we find

$$\text{mean value of } V_s^2 = d^2 + \mu'^2 \qquad (6).$$

If we form equation (6) for all the possible sets of $n$ observations,
and then take the mean of all these equations, we find

$$\mu^2 = \overline{d^2} + \overline{\mu'^2},$$

or, substituting for $\overline{d^2}$ from equation (5),

$$\mu^2 = \overline{\mu'^2} + \frac{\mu^2}{n}\cdot\frac{N-n}{N-1} \qquad (7).$$

If we have a very large number $N$ of observations, and we take
samples of $n$ at a time, from among the number $N$, the average
value of the (mean-square-residual)$^2$ for all the possible samples
is connected with the mean square residual of all the $N$ observations
($\mu$), by equation (7).

If $N$ be regarded as becoming an indefinitely large number,
the mean of the $N$ observed values may be taken as the true value
of the unknown, and $\mu$ is then the M.S.E. of a single observation.
Equation (7) then yields

$$\mu^2 = \frac{n}{n-1}\,\overline{\mu'^2} = \frac{n}{n-1}\cdot\frac{[vv]}{n} = \frac{[vv]}{n-1}.$$

Whence

$$\mu = \sqrt{\frac{[vv]}{n-1}}.$$

The formula $(n-1)\mu^2 = [vv]$ may be interpreted as follows.
If we take an infinite series of observations of any quantity, and
select a large number of samples of $n$ observations, the sum of
the squares of the $n$ residuals calculated from the mean of the
$n$ observations will have the mean value $(n-1)\mu^2$. The factor $n-1$
is due to the fact that the mean value of the observations in any
one sample is liable to differ from the true value of the unknown.
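The relation between the average sample value of $[vv]/n$ and $\mu^2$ can be checked by direct enumeration. The following is only an illustrative sketch: the six-value population and the function names are assumptions, not taken from the text. It forms every sample of $n = 3$ out of $N = 6$ values, averages the mean square residual over all samples, and compares the result with $\mu^2 N(n-1)/\{n(N-1)\}$, which is equation (7) solved for that average.

```python
from fractions import Fraction
from itertools import combinations


def mean_square_residual(values):
    """Return [vv]/n: the mean squared deviation from the arithmetic mean."""
    n = len(values)
    mean = sum(values, Fraction(0)) / n
    return sum((x - mean) ** 2 for x in values) / n


def average_sample_msr(population, n):
    """Average of [vv]/n over every possible sample of n values."""
    samples = list(combinations(population, n))
    return sum(mean_square_residual(s) for s in samples) / len(samples)


# Hypothetical population of N = 6 observations (exact rational arithmetic).
population = [Fraction(v) for v in (3, 7, 8, 12, 15, 21)]
N, n = len(population), 3

mu_sq = mean_square_residual(population)        # mu^2 = [VV]/N
lhs = average_sample_msr(population, n)         # mean value of mu'^2
rhs = mu_sq * Fraction(N * (n - 1), n * (N - 1))

print(lhs, rhs, lhs == rhs)
```

As $N$ grows, the factor $N/(N-1)$ tends to unity and the average of $[vv]$ over samples tends to $(n-1)\mu^2$, which is the interpretation given above.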

21.  Probable  error  and  mean  square  error  of  the 
arithmetic  mean. 

Returning to equation (5) above, the square root of $\overline{d^2}$ is
the M.S.E. of the mean of $n$ observations. If we take $N$ to be an
infinitely large number, $\mu$ becomes the M.S.E. of a single
observation. Equation (5) may then be written

$$\overline{d^2} = \frac{\mu^2}{n},$$

or $$\text{M.S.E. of A.M.} = \frac{\text{M.S.E. of a single observation}}{\sqrt{n}}.$$

Since the relation $r = 0.6745\,\mu$ is always true, we obtain
immediately the result

$$\text{P.E. of A.M.} = \frac{\text{P.E. of a single observation}}{\sqrt{n}} = \frac{r}{\sqrt{n}} = 0.6745\sqrt{\frac{[vv]}{n(n-1)}}.$$

22.  Probable  error  of  a  linear  function  of  a  number  of 
independent  quantities  whose  probable  errors  are  known*. 

In the first case we shall consider a linear function of two
independent quantities $m_1$, $m_2$, whose mean square errors are
$\mu_1$, $\mu_2$. Let the linear function be

$$F = a_1 m_1 + a_2 m_2,$$

where $a_1$ and $a_2$ are constants.

If an error $x$ be made in determining $m_1$, and an error $y$ in
determining $m_2$, the corresponding error $dF$ in $F$ is given by

$$dF = a_1 x + a_2 y.$$

Squaring this equation, we obtain

$$dF^2 = a_1^2 x^2 + a_2^2 y^2 + 2a_1a_2\,xy.$$

* This proof assumes that the error law of a linear function of a number of
independent quantities is of the same form as the error law of each of these
quantities. Vide § 15 for proof of this assumption.

This equation holds for all values of $x$ and $y$, and therefore the
mean value of the L.H.S. is equal to the mean value of the R.H.S.
But if $\mu$, $\mu_1$, $\mu_2$ be the M.S.E.'s of $F$, $m_1$, and $m_2$
respectively, the equation leads to

$$\mu^2 = a_1^2\mu_1^2 + a_2^2\mu_2^2 + 2a_1a_2 \times \text{mean value of } xy.$$

Since $m_1$ and $m_2$ are independent, the errors $x$ and $y$ are also
independent. With a given value of $x$, positive and negative
values of $y$ are equally likely to be associated; and so the mean
value of the product $xy$ is zero.

$$\therefore \quad \mu^2 = a_1^2\mu_1^2 + a_2^2\mu_2^2.$$

This result may be extended to apply to any number of
variables, the proof being the same as that given above for two
variables. In general, if

$$F = a_1 m_1 + a_2 m_2 + \ldots + a_n m_n,$$

where $a_1, a_2, \ldots, a_n$ are constant, then

$$\mu^2 = a_1^2\mu_1^2 + a_2^2\mu_2^2 + \ldots + a_n^2\mu_n^2,$$

or, if $r, r_1, r_2, \ldots, r_n$ be the P.E.'s of $F, m_1, m_2, \ldots, m_n$,

$$r^2 = a_1^2 r_1^2 + a_2^2 r_2^2 + \ldots + a_n^2 r_n^2.$$

Probable error of the Arithmetic Mean.

The P.E. of the A.M. can be immediately derived from the
above. For if $x_1, x_2, \ldots, x_n$ be $n$ independently observed values,
each with P.E. $r$, and if $r'$ be the P.E. of the A.M. $\bar{x}$, then

$$\bar{x} = \frac{1}{n}(x_1 + x_2 + \ldots + x_n),$$

and $$r'^2 = \frac{1}{n^2}(r^2 + r^2 + \text{etc. to } n \text{ terms}) = \frac{r^2}{n}.$$

The P.E. of the A.M. is therefore $r/\sqrt{n}$.

If $\mu$ be the M.S.E. of a single observation, the M.S.E. of
the A.M. is $\mu/\sqrt{n}$.
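The rule for a linear function can be put into a few lines of code. This is a minimal sketch, not part of the original text: the function name `pe_linear` and the numerical values are assumptions chosen for illustration. It evaluates $r^2 = a_1^2 r_1^2 + \ldots + a_n^2 r_n^2$ and confirms that coefficients all equal to $1/n$ reproduce the result $r/\sqrt{n}$ for the arithmetic mean.

```python
import math


def pe_linear(coeffs, pes):
    """P.E. of F = sum(a_i * m_i) for independent m_i with probable errors r_i."""
    if len(coeffs) != len(pes):
        raise ValueError("coeffs and pes must have the same length")
    return math.sqrt(sum((a * r) ** 2 for a, r in zip(coeffs, pes)))


# Two quantities with hypothetical coefficients 3 and 4 and unit P.E.'s.
print(pe_linear([3, 4], [1, 1]))

# Arithmetic mean of n values, each with the same (hypothetical) P.E. r.
n, r = 20, 0.5
pe_mean = pe_linear([1 / n] * n, [r] * n)
print(pe_mean, r / math.sqrt(n))
```

The same function applies to M.S.E.'s, since the two formulae have identical form.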


Exercise. Derive the above results from the formulae given at the end
of § 15.

23.  Peters' Formula for r.

Let $x_1, x_2, \ldots, x_n$ be $n$ observed values of a quantity whose
true value is $x$. Then the true errors $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ are given by
the equations

$$x_1 = x - \epsilon_1, \qquad x_2 = x - \epsilon_2, \text{ etc.}$$

The A.M. of the observed values is

$$\frac{1}{n}(x_1 + x_2 + \ldots + x_n),$$

or $$x - \frac{1}{n}(\epsilon_1 + \epsilon_2 + \ldots + \epsilon_n).$$

If $v_1$, $v_2$, etc. be the deviations of $x_1$, $x_2$, etc. from the mean,

$$v_1 = -\frac{1}{n}\{(n-1)\epsilon_1 - \epsilon_2 - \epsilon_3 - \ldots - \epsilon_n\},$$

$$v_2 = -\frac{1}{n}\{-\epsilon_1 + (n-1)\epsilon_2 - \epsilon_3 - \ldots - \epsilon_n\}, \text{ etc.}$$

All the $n$ observations are supposed to be liable to the same
errors, or are supposed to follow the same error law. And since
the residuals $v_1$, $v_2$, etc. are linear functions of $\epsilon_1$, $\epsilon_2$, etc., it follows
from § 15 that the residuals are subject to a similar error law,
the parameter $h$ being the same for all the residuals, since the
sum of the squares of the coefficients on the R.H.S. is the same
for all $v$'s.

Let the P.E. of each $\epsilon$ be $r$, and let the P.E. of any residual $v$
be $r'$. Then it follows from § 22 that

$$r'^2 = \frac{r^2}{n^2}\{(n-1)^2 + (n-1)\} = r^2\,\frac{n-1}{n} \qquad (1).$$

Hence the true P.E. $r$ of the observations is $r'\sqrt{\dfrac{n}{n-1}}$, where
$r'$ is the P.E. derived from the residuals.

If $[v]$ be the sum of the residuals, taken without regard to
sign, then

$$r' = 0.8453\,\frac{[v]}{n}. \quad \text{(See page 31.)}$$

$$\therefore \quad r = 0.8453\,\frac{[v]}{\sqrt{n(n-1)}} \qquad (2).$$

Equation (2) is known as Peters' formula for the probable error.

It should be noted that equation (1) above leads also to the
ordinary formula for the P.E. derived from the squares of the
residuals. The P.E. of a residual

$$r' = 0.6745\sqrt{\frac{[vv]}{n}},$$

$$\therefore \quad r = r'\sqrt{\frac{n}{n-1}} = 0.6745\sqrt{\frac{[vv]}{n-1}} \qquad (3).$$

We  thus  have  two  equations  for  the  evaluation  of  the  P.  E.  of 
a  single  observation  from  the  residuals. 
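The two numerical constants can be recovered from the normal law, which may help in remembering where they come from. The sketch below is an illustration only, resting on two standard properties of the normal law that are assumed here rather than quoted from this section: the probable error is the deviation exceeded in absolute value half the time, and the mean absolute error is $\mu\sqrt{2/\pi}$. The function names are hypothetical.

```python
import math
from statistics import NormalDist

# r = K_GAUSS * mu: half of all errors are numerically smaller than r.
K_GAUSS = NormalDist().inv_cdf(0.75)
# Mean absolute error = mu * sqrt(2/pi), hence r = K_PETERS * (mean absolute error).
K_PETERS = K_GAUSS * math.sqrt(math.pi / 2)


def pe_from_squares(residuals):
    """P.E. of a single observation from the squares of the residuals, equation (3)."""
    n = len(residuals)
    return K_GAUSS * math.sqrt(sum(v * v for v in residuals) / (n - 1))


def pe_peters(residuals):
    """P.E. of a single observation by Peters' formula, equation (2)."""
    n = len(residuals)
    return K_PETERS * sum(abs(v) for v in residuals) / math.sqrt(n * (n - 1))


print(round(K_GAUSS, 4), round(K_PETERS, 4))
```

Both functions take the residuals from the arithmetic mean; dividing either result by $\sqrt{n}$ gives the P.E. of the mean.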

24.  Some Examples of the Adjustment of Observations of one unknown.

Example 1. The heat of evaporation of water*.
Each of the 20 values given in the first column of the table below is an
independent determination of the heat of evaporation of water. It is required
to find the adjusted value, and its P.E. The A.M. of all the observations is
adopted as the most probable value. The residuals obtained by subtracting
the A.M. from each determination are written in the second column, the
positive and negative residuals being separated into sub-columns. The ¼ squares
of the residuals (one quarter of each squared residual) are written in the
last column.


Observed value      Residual              ¼ square of residual
                    +          -
542.98              1.056                 .2788
541.23                          .694      .1204
540.64                         1.284      .4122
542.03               .106                 .0028
542.32               .396                 .0392
541.48                          .444      .0493
542.37               .446                 .0497
542.15               .226                 .0128
541.36                          .564      .0795
541.34                          .584      .0853
542.91               .986                 .2430
542.68               .756                 .1429
543.08              1.156                 .3341
542.12               .196                 .0096
541.82                          .104      .0027
540.96                          .964      .2323
541.66                          .264      .0174
541.73                          .194      .0094
541.79                          .134      .0045
541.83                          .094      .0022

Mean 541.924       +5.324     -5.324     2.1281

* A. W. Smith, Physical Review, September, 1911.

The correspondence of the sums of positive and negative residuals affords
a check on the value of the A.M.

$[vv] = 2.1281 \times 4 = 8.5124.$

Using the formulae of pages 32 and 36, we find

P.E. of a single observation $= .6745\sqrt{\dfrac{8.5124}{19}} = .451$,

P.E. of A.M. $= \dfrac{.451}{\sqrt{20}} = .101$.

Using Peters' formula we find

P.E. of a single observation $= .8453 \times \dfrac{10.648}{\sqrt{20 \times 19}} = .462$,

P.E. of A.M. $= \dfrac{.462}{\sqrt{20}} = .103$.

The two formulae for P.E. yield almost identical results. The final value
of the unknown, together with its P.E., can be represented as

541.924 ± .102.
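The arithmetic of this example is easily repeated by machine. The following sketch is an illustration, not part of the original reduction: it takes the 20 observed values from the table (with the abbreviated entries written out in full, an assumption that is consistent with the tabulated mean and residuals), forms the residuals, and applies both formulae with the constants .6745 and .8453 used in the text. The variable names are hypothetical.

```python
import math

# The 20 determinations of Example 1, written out in full.
observed = [
    542.98, 541.23, 540.64, 542.03, 542.32, 541.48, 542.37, 542.15,
    541.36, 541.34, 542.91, 542.68, 543.08, 542.12, 541.82, 540.96,
    541.66, 541.73, 541.79, 541.83,
]

n = len(observed)
mean = sum(observed) / n
residuals = [x - mean for x in observed]

vv = sum(v * v for v in residuals)          # [vv], sum of squared residuals
abs_sum = sum(abs(v) for v in residuals)    # sum of residuals without regard to sign

pe_single_squares = 0.6745 * math.sqrt(vv / (n - 1))
pe_single_peters = 0.8453 * abs_sum / math.sqrt(n * (n - 1))
pe_mean_squares = pe_single_squares / math.sqrt(n)
pe_mean_peters = pe_single_peters / math.sqrt(n)

print(f"mean = {mean:.3f}, [vv] = {vv:.4f}, sum |v| = {abs_sum:.3f}")
print(f"P.E. single: {pe_single_squares:.3f} (squares), {pe_single_peters:.3f} (Peters)")
print(f"P.E. of A.M.: {pe_mean_squares:.3f} (squares), {pe_mean_peters:.3f} (Peters)")
```

Small differences in the last figure from the printed values arise only from the rounding of the quarter-squares in the table.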

The residuals of column 2 are shown in figure 4. The whole range of
variation is divided up into intervals of .4, the middle interval extending
from -.2 to +.2. In the diagram a dot is put in to represent, as accurately
as possible, the value of each residual. An ordinate is erected at the middle
of each interval, of a length proportional to the number of observations falling
within that interval. The top of each ordinate is represented by a ×. A smooth
symmetrical curve is drawn to fit the tops as closely as possible. Although
two of the points representing positive residuals are at some distance from
the curve, it is seen that the general form of the curve is in fair agreement
with the Gauss error curve shown in figure 3. The discrepancy is possibly
due to the smallness of the number of determinations used in the reduction.

In the diagram two ordinates are erected at distances ±.451 from the
origin. According to the least square theory, 10 observed values should lie
between these ordinates, and 10 outside these limits. It is seen that 11
observed values lie within the given limits, and 9 without; a result in
sufficiently good agreement with the theory, in view of the relative smallness
of the number of determinations.
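The count of residuals inside and outside the limits of the probable error can be verified directly. This short sketch is illustrative only: it copies the 20 residuals of the table with their signs and counts those numerically less than .451; the variable names are assumptions.

```python
# Residuals of Example 1, with signs, as tabulated.
residuals = [
    1.056, -0.694, -1.284, 0.106, 0.396, -0.444, 0.446, 0.226,
    -0.564, -0.584, 0.986, 0.756, 1.156, 0.196, -0.104, -0.964,
    -0.264, -0.194, -0.134, -0.094,
]

pe = 0.451  # P.E. of a single observation, from the squares of the residuals

inside = sum(1 for v in residuals if abs(v) < pe)
outside = len(residuals) - inside

print(f"{inside} within ±{pe}, {outside} outside; theory expects {len(residuals) // 2} each")
```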

Example 2.  The atomic weight of bromine*.

The ratio of the weights of Bromine and Hydrogen which combine to form
Hydrobromic acid was determined experimentally. The results of 10
independent determinations are given in the first column of the table below.


Observed value      Residual              ¼ square of residual
                    +          -
79.2863                        .0204      .00010404
79.3055                        .0012      .00000036
79.3064                        .0003      .00000002
79.3197             .0130                 .00004225
79.3114             .0047                 .00000552
79.3150             .0083                 .00001722
79.3063                        .0004      .00000004
79.3141             .0074                 .00001369
79.2915                        .0152      .00005776
79.3108             .0041                 .00000420

Mean 79.3067        .0375      .0375      .00024510

$[vv] = .0009804, \qquad [v] = .075.$

P.E. of a single determination $= \sqrt{\dfrac{.0009804}{9}} \times .6745 = .0070$,

P.E. of the A.M. $= \dfrac{.0070}{\sqrt{10}} = .0022$,

P.E. of A.M. from Peters' formula $= \dfrac{1}{\sqrt{10}} \times .8453 \times \dfrac{.075}{\sqrt{90}} = .0021$.

The adjusted value of the ratio is thus 79.3067 ± .002.

The atomic weight of Hydrogen = 1.00779.

Therefore the atomic weight of Bromine

= 1.00779 (79.3067 ± .002) = 79.924 ± .002.


* Weber, Bulletin of Bureau of Standards, Vol. ix, p. 131.

Example 3.  The percentage of dry matter in mangel roots*.

The percentage of dry matter was estimated for each of 160 roots of a
strain of Golden Globe mangel. The results varied between 10.7% and 19.7%,
and the A.M. of all the results gave 14.5%. It was necessary to consider
whether it was justifiable to take the A.M. of such widely differing results, and
if so, what was its precision. In order to find the answer to this question
we must first consider what are the different causes which tend to produce
differences in content of dry matter in individual roots; and whether these
causes satisfy the assumptions made on page 5 as to the nature of accidental
errors.

All  the  roots  taken  were  of  the  same  strain,  grown  side  by  side,  and 
sampled  and  analysed  in  the  same  manner.  Thus  the  possibilities  of  variation 
in  individual  roots  were  reduced  to  a  minimum.  But  there  still  remained 
certain  possible  causes  of  variation.  Since  mangels  are  easily  cross-fertilised, 
a commercial strain will not be a pure one, and so varying parentage may be
a  possible  cause  of  variation  in  constitution.  The  slight  differences  in  soil 
will  vary  the  food  supply;  the  hoeing  will  not  be  absolutely  regular;  the 
distribution  of  manure  will  not  be  perfectly  uniform,  and  slight  errors  may 
enter  into  the  analyses  of  the  roots.  Each  of  these  separate  causes  is  equally 
likely  to  make  the  percentage  of  dry  matter  in  an  individual  root  higher  or 
lower  than  the  average  value.  In  the  majority  of  cases,  some  of  these  causes 
will  tend  to  raise,  and  some  to  lower  the  result.  It  will  only  happen  in 
relatively  few  cases  that  all  the  causes  of  variation  act  in  the  same  direction, 
yielding a result differing considerably from the mean value. A curve showing
the frequencies of different percentages of dry matter should thus have its
maximum  at  the  mean  value  of  the  percentage,  and  should  be  symmetrical 
about  the  mean  value.  It  might,  in  fact,  be  expected  to  yield  a  curve  of  the 
same  general  form  as  the  error  curve  shown  in  figure  3. 

The  actual  distribution  of  frequencies  is  shown  in  figure  5.  In  this 
diagram,  the  percentage  of  dry  matter  is  represented  along  the  horizontal 
axis,  and  the  number  of  roots  along  the  vertical  axis.  For  each  root  a 
dot  is  placed  in  the  diagram  above  the  corresponding  percentage.  The 
whole range of variation is divided into intervals, from 10-11%, 11-12%,
etc.,  and  an  ordinate  is  erected  at  the  middle  point  of  each  section,  of  a 
length  proportional  to  the  number  of  dots  in  that  section.  The  tops  of  these 
ordinates  are  represented  in  the  diagram  by  crosses.  It  is  clear  from 
inspection  that  it  would  be  possible  to  draw  a  smooth  symmetrical  curve 
passing  very  near  to  the  tops  of  the  ordinates.  The  form  of  such  a  curve 
would  be  in  good  agreement  with  that  of  the  ideal  error  curve  shown  in 
figure 3. To test this more closely, the Gauss error curve was drawn for
comparison with the actual distribution. This curve is the smooth curve
shown in figure 5. The method of construction of this curve will be
explained later on, but meanwhile we may note that the crosses representing
the actual distribution all lie very near to the curve.

* Wood and Stratton, "The Interpretation of Experimental Results," Journal
of Agricultural Science, Vol. iii, part 4.

[Figure omitted. Horizontal axis: Percentage of dry matter.]

Fig. 5.

The P.E. of a single determination is given as 1.1%*. Thus the P.E. of
the A.M. is $1.1/\sqrt{160} = 0.09$. In the diagram, ordinates are drawn on
each side of the mean, distant 1.1 from it. According to the theoretical
discussion, 80 of the observed values should lie within, and 80 outside these
limits. A count of the dots shows that 81 observations are within 1.1% of the
mean, and 79 outside these limits, a result in excellent agreement with the
demands of the theory.

The equation of the normal error curve is

$$y = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2},$$

where

$$h = \frac{.47696}{r} = \frac{.47696}{1.1} = .43.$$

A sufficient number of ordinates to enable us to draw the curve can easily be
evaluated by means of a table of logarithms.
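The ordinates can equally be evaluated by a short computation. The sketch below is an illustration under stated assumptions: the curve is taken in the usual form $y = (h/\sqrt{\pi})\,e^{-h^2x^2}$ with $x$ the deviation from the mean, and each ordinate is scaled by the 160 roots and an interval width of 1 per cent so that it may be compared with the number of roots in an interval. That scaling, and the names used, are assumptions and not a statement of how figure 5 was actually drawn.

```python
import math


def error_curve_ordinate(x, h):
    """Ordinate y = (h / sqrt(pi)) * exp(-h^2 * x^2) at deviation x from the mean."""
    return h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2)


r = 1.1            # P.E. of a single determination, in per cent (from the text)
h = 0.47696 / r    # constant as printed in the text

# Assumption: scale by the number of roots and a 1 per cent interval width.
n_roots, interval = 160, 1.0
for deviation in (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0):
    count = n_roots * interval * error_curve_ordinate(deviation, h)
    print(f"deviation of {deviation:.1f} per cent: about {count:.1f} roots per interval")

# Half of the area under the curve lies within the probable error.
print(round(math.erf(h * r), 3))
```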

In the preceding example, the symmetry of the distribution of frequencies
about the A.M. appears to justify the assumption that the different causes
of variation in content of individual roots were equally likely to raise as to
lower the result. In general, when some of the causes of variation are
dissymmetrical in their action, tending to give high results oftener than low
results, or vice versa, the frequency curve is unsymmetrical, even when the
number of observations is sufficiently large to give the laws of chance fair
play. If the number of observations be large, and the curve unsymmetrical,
it is generally safe to assume that there is at work some cause which tends
to act always in one direction, giving an abnormal number of high or of low
results. Such a case is discussed in Example 4 below.

* Loc. cit.


Example  4.     The  average  weight  of  a  number  of  mangel  roots*. 

The figure shows the distribution of the weights of 196 roots. In this
case no smooth curve has been drawn, but the tops of successive ordinates
have been joined by straight lines. The resulting curve is clearly
dissymmetrical, showing that either an abnormally large number of large roots, or
an abnormally small number of small roots, was taken. The latter
alternative probably affords the true explanation, as the weak plants which would
produce small roots would be destroyed in the process of hoeing and singling;
and it is possible that the very small roots would be unconsciously passed
over in sampling. The true curve of frequency should therefore extend
further in the direction of small weights. The effect of the absence of small
roots is to give an apparent mean weight greater than the true value. The
method of least squares is not strictly applicable to such a distribution as is
here shown, as the theory demands a distribution of frequencies which shall
be symmetrical about the mean.

[Figure omitted. Horizontal axis: Weight of roots in lbs., from 4 to 8.5.]

Fig. 6.

* Wood and Stratton, loc. cit.

Example 5.  Magnitude-interval of a parallel-wire grating*.

A parallel-wire grating consists of a series of equidistant parallel wires
fitted on a frame. When this is set over the object glass of a telescope, it
acts as a diffraction-grating, so that each star, instead of producing a single
dot upon the photographic plate, produces a central bright dot, with a series of
dots of decreasing brightness on each side of it. The dots are in reality small
spectra, but on account of the smallness of the dispersion they are sensibly round.
The proportion of the light of a star which is deviated into an image of a
given order is constant, depending only on the form of the grating. There is
accordingly a definite magnitude-interval between a central image and the
auxiliary image on each side of it. There will be this same
magnitude-interval between a given star and another star whose central image is equal
in size and greyness to the first diffraction image of the first star. Thus when
the magnitude-interval between consecutive diffraction images is known, the
magnitudes of all the stars on the plate can be determined, provided there
are on the plate some stars whose magnitudes are known. The adjoining
table shows a series of determinations of the magnitude-interval of a grating,
based on measurements of a number of stars upon plates with different
exposures.


No. of plate        4996  5014  5061  4922  4923  4924  4997  5059  5070  5028
Magnitude of star
 8.89               2.68  2.66  3.09  2.66  2.73  2.31  2.65  2.67  2.94    -
10.42               2.67  2.90  2.62  2.67  2.75  2.86  2.98  2.60  2.68  2.57
10.54               2.74  2.86  2.84  2.68  2.66    -   2.67  2.46  2.60  2.75
10.62               2.56  2.54  2.64  2.78  2.62  2.69  2.72  2.65  2.59    -
10.64               2.49  2.58  2.77  2.70  2.70  2.77  2.66  2.72  2.50    -
10.64               2.56  2.69  2.62  2.70  2.70  2.73  2.66  2.61  2.46    -
10.66                 -     -     -   2.76  2.63  2.64    -     -     -     -
10.69                 -     -   2.72  2.74  2.66  2.94    -     -   2.82    -
10.90               2.78  2.73  2.91  2.64  3.03    -     -     -     -     -
10.94               2.82    -   2.85    -     -     -     -     -     -     -
10.95               2.79  2.95  2.77  2.60  2.59  2.99    -     -     -     -
11.01                 -   2.69  2.89    -     -     -     -     -     -     -
11.03               2.76  2.79  2.62  2.55  2.97    -     -     -     -     -
11.12               2.65    -   2.80  2.41  2.87  2.86    -     -     -     -
11.26               2.67    -   2.77  2.32  2.87    -     -     -     -     -
11.47               3.00    -     -     -     -     -     -     -     -     -
11.57               2.68    -     -     -     -     -     -     -     -     -
11.97               2.63    -     -     -     -     -     -     -     -     -
12.12               2.54    -     -     -     -     -     -     -     -     -
12.25               2.83    -     -     -     -     -     -     -     -     -
12.32               2.79    -     -     -     -     -     -     -     -     -

* Chapman and Melotte, Monthly Notices, R.A.S., November, 1913.

The mean of 98 determinations = 2^m.71.

The P.E. of a single estimate calculated from the squares of the residuals
= ±0^m.097.

The P.E. of the mean value is $\dfrac{.097}{\sqrt{98}} = .0097$.

The final result may thus be written

2^m.71 ± .0097.

The P.E. of a single estimate calculated from the sum of the residuals
is .094.

The frequency curve is shown in figure 7. It is seen that the A.M. does
not coincide with the maximum ordinate. The latter occurs at about 2^m.67,
whereas the A.M. is 2^m.71. The number of measurements represented by the
curve is 98, and this appears to be a sufficiently large number to yield an
accurate representation of the true nature of the curve of errors. We are
forced to conclude that there is present some cause of dissymmetry.

[Figure omitted. Horizontal axis: Magnitude-interval.]

Fig. 7.

The P.E. of a single observation being 0^m.097, two ordinates are drawn,
one on each side of the A.M., and at a distance of 0^m.097 from it. It is found
that 57 measurements lie between these limits, and 41 outside them. Theory
would require 49 observations to lie within the limits ±0^m.097, and 49
outside these limits. It is thus no longer strictly possible to attach the
original meaning to the P.E. deduced from the residuals on the assumption
that the mean is the true value of the unknown.

When the curve of frequencies is of the form of the curve in figure 7,
indicating a genuine dissymmetry in the distribution of frequencies, it would
perhaps be better to adopt the abscissa corresponding to the maximum
frequency, as the most plausible value of the unknown. The value of the
P.E. deduced from the formula may be taken to indicate roughly whether
the observations are closely clustered about the mean, or are spread over a
considerable range. But this P.E. cannot be regarded as having the meaning
originally attached to the P.E., since the curve representing the frequency
distribution is not of the form of the Gauss error curve represented in
figure 3.


Example 6.  Glume-length of wheat.

Figure 8 shows the results of measuring the length of the glumes of
595 individual wheat plants. The curve shows three well-defined maxima,
and this fact in itself would arouse suspicion as to the purity of strain of the
wheat measured. The plants measured were, as a matter of fact, the
second generation from a cross between Rivet wheat (with an average
glume-length of 9 mm.) and Polish wheat (with an average glume-length of 28 mm.).
The curve shows that the plants examined are divided into three well-defined
groups, one resembling the short-glumed Rivet parent, one resembling the
long-glumed Polish parent, and the third intermediate between these two with
an average glume-length of 17 mm.

[Figure omitted. Horizontal axis: Length of glume in mm., from 9 to 37.]

Fig. 8.

The application of the ordinary least-square method to such material as
is here represented is quite meaningless. This particular example illustrates
very clearly the utility of commencing the discussion of a number of
observations by drawing the curve of frequencies. Here the non-homogeneous
nature of the material is immediately shown by the curve: while in cases
like those considered in Examples 4 and 5 above, the curve of frequencies
shows the presence of a systematic error.

EXAMPLES.

In the following Examples calculate the P.E. from both Gauss's and Peters'
formula.

1.  Evaluate the P.E. of a single determination, and of the mean value, of
the variable tabulated on page 6.

2.  The following table gives 12 determinations of the azimuth of Allen
from Sears, Texas. (U.S. C. and G. Survey. Publications, No. 14, p. 149.)

98° 6' 41".5
       42".8
       43".4
       43".1
       39".7
       42".7
       41".6
       43".3
       40".0
       45".0
       43".3
       40".7

Find the mean value, the P.E. of a single determination, and the P.E. of
the mean.

3.  From the following 15 independent determinations of the coefficient of
expansion of dry air (Rudberg, Poggendorff's Annalen, 41, p. 271), find the
A.M. and its P.E.:

3.643 × 10^-3      3.636 × 10^-3      3.646 × 10^-3
   54                 51              3.662
   44                 43              3.840
   50                 43              3.902
   53                 45              3.652

25.  Probable error of any function of a number of
independent quantities whose probable errors are known.

Let $m_1, m_2, \ldots, m_n$ be $n$ independent quantities, whose P.E.'s
are $r_1, r_2, \ldots, r_n$, and let

$$F = f(m_1, m_2, \ldots, m_n)$$

be the function of $m_1$, $m_2$, etc. whose P.E. is required.

If $dF$ be the error in the value of $F$ produced by errors
$dm_1$, $dm_2$, etc. in the values of $m_1$, $m_2$, etc., then

$$F + dF = f(m_1 + dm_1,\, m_2 + dm_2,\, \ldots,\, m_n + dm_n)$$

$$= f + \frac{\partial f}{\partial m_1}dm_1 + \frac{\partial f}{\partial m_2}dm_2 + \ldots + \frac{\partial f}{\partial m_n}dm_n.$$

The error in $F$ is thus given by

$$dF = \frac{\partial f}{\partial m_1}dm_1 + \frac{\partial f}{\partial m_2}dm_2 + \ldots + \frac{\partial f}{\partial m_n}dm_n$$

$$= a_1\,dm_1 + a_2\,dm_2 + \ldots + a_n\,dm_n,$$

where $a_1 = \dfrac{\partial f}{\partial m_1}$, $a_2 = \dfrac{\partial f}{\partial m_2}$, etc.

From this point onward the problem is reduced to that of finding
the P.E. of a linear function of $n$ independent variables.
The result may be written

$$r^2 = a_1^2 r_1^2 + a_2^2 r_2^2 + \ldots, \text{ etc.,}$$

or $$\mu^2 = a_1^2\mu_1^2 + a_2^2\mu_2^2 + \ldots, \text{ etc.,}$$

where $a_s = \dfrac{\partial f}{\partial m_s}$.

The  above  formulae  sometimes  break  down  in  practice,  when  the 
errors  are  not  small  enough  to  justify  the  neglect  of  the  square 
and  higher  terms  in  the  Taylor  expansion. 
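When the partial derivatives are troublesome to write down, they may be estimated numerically. The following is a minimal sketch under stated assumptions, not a method given in the text: the function name `propagated_pe`, the central-difference step, and the numerical values are all hypothetical. It evaluates $r^2 = \sum a_s^2 r_s^2$ with each $a_s$ approximated by a central difference, and compares the result for the area of a rectangle, $z = xy$, with the exact expression $\sqrt{y^2 r_1^2 + x^2 r_2^2}$.

```python
import math


def propagated_pe(f, values, pes, rel_step=1e-6):
    """Approximate P.E. of F = f(m_1, ..., m_n) from r^2 = sum(a_s^2 * r_s^2),
    each partial derivative a_s being estimated by a central difference."""
    total = 0.0
    for s, (m, r) in enumerate(zip(values, pes)):
        step = rel_step * max(abs(m), 1.0)
        upper = list(values)
        lower = list(values)
        upper[s] = m + step
        lower[s] = m - step
        a_s = (f(*upper) - f(*lower)) / (2 * step)
        total += (a_s * r) ** 2
    return math.sqrt(total)


# Hypothetical rectangle: sides x, y with probable errors r1, r2.
x, y, r1, r2 = 4.0, 3.0, 0.02, 0.01
pe_area = propagated_pe(lambda a, b: a * b, [x, y], [r1, r2])
exact = math.sqrt((y * r1) ** 2 + (x * r2) ** 2)

print(pe_area, exact)
```

Like the formula itself, this first-order estimate is reliable only while the errors are small enough for the higher terms of the Taylor expansion to be neglected.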

EXAMPLES. 

1.  If $x$ and $y$ be the sides of a rectangle, and $r_1$, $r_2$ their P.E.'s, find the
P.E. $r$ of the area of the rectangle.

Let $z = xy$.

Then $dz = y\,dx + x\,dy$.

Applying the formula derived above, we find
$$r^2 = y^2 r_1^2 + x^2 r_2^2.$$

2.  The edge of a cube is of length $a$, and its P.E. is $r$. Find the P.E. of
the volume of the cube. Ans. $3a^2 r$.

3.  The P.E.'s of the three edges of a cube are $r_1$, $r_2$, $r_3$. Find the P.E. of
the volume. Ans. $a^2(r_1^2 + r_2^2 + r_3^2)^{1/2}$.

4.  Given that the P.E. of $x$ is $r$, find the P.E.'s of $e^{-x}$, $\log x$, $\cos x$, and

5.  If the P.E. of a single reading of the graduated circle on a meridian
circle be $r$, what is the P.E. of a result based on the readings of $n$ microscopes
placed at $n$ different points of the circle? Ans. $\dfrac{r}{\sqrt{n}}$.

6.  Given the following telegraphic longitude determinations, find the
longitude of Moscow east of Greenwich, and estimate its P.E.

Potsdam - Greenwich    0h 52m 16s.051 ± 0s.003.
Pulkowa - Potsdam      1h  9m  2s.491 ± 0s.003.
Moscow - Pulkowa       0h 28m 58s.450 ± 0s.010.

7.  Given the sides $a$, $b$, and the angle $C$ of a triangle, and their P.E.'s, find
the P.E.'s of the side $c$, and of the area of the triangle.

(i) $$c^2 = a^2 + b^2 - 2ab\cos C,$$

$$c\,dc = (a - b\cos C)\,da + (b - a\cos C)\,db + ab\sin C\,dC$$

$$= c\cos B\,da + c\cos A\,db + ab\sin C\,dC;$$

$$dc = \cos B\,da + \cos A\,db + a\sin B\,dC.$$

Applying the formula derived above, we find

$$r_c^2 = r_a^2\cos^2 B + r_b^2\cos^2 A + r_C^2\,a^2\sin^2 B\,\sin^2 1'',$$

where $r_C$ is measured in seconds of arc.

(ii) $$\Delta = \tfrac{1}{2}ab\sin C,$$

$$\frac{d\Delta}{\Delta} = \frac{da}{a} + \frac{db}{b} + \cot C\,dC,$$

$$\frac{r_\Delta^2}{\Delta^2} = \frac{r_a^2}{a^2} + \frac{r_b^2}{b^2} + \cot^2 C\,\sin^2 1''\,r_C^2.$$

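8. The coefficient of expansion of a rod is determined by measuring its
length at two different temperatures. If the length be $l_1$ at a temperature $t_1$,
and $l_2$ at a higher temperature $t_2$, the coefficient is given by

$$\alpha = \frac{l_2 - l_1}{l_1 (t_2 - t_1)}.$$

If $r_t$ be the p.e. of a temperature-reading, and $r_l$ the p.e. of a
length-determination, find the p.e. of $\alpha$.

The p.e. of $l_2 - l_1$ is $\sqrt{2}\,r_l$, and the p.e. of $(t_2 - t_1)$ is $\sqrt{2}\,r_t$. Taking logarithms of
both sides of the equation for $\alpha$, and differentiating, we obtain

$$\frac{d\alpha}{\alpha} = \frac{d(l_2 - l_1)}{l_2 - l_1} - \frac{dl_1}{l_1} - \frac{d(t_2 - t_1)}{t_2 - t_1},$$

whence we find

$$\frac{r_\alpha^2}{\alpha^2} = \frac{2r_l^2}{(l_2 - l_1)^2} + \frac{r_l^2}{l_1^2} + \frac{2r_t^2}{(t_2 - t_1)^2} = r_l^2\left(\frac{1}{l_1^2} + \frac{2}{(l_2 - l_1)^2}\right) + \frac{2r_t^2}{(t_2 - t_1)^2}.$$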

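The result of Example 8 may be evaluated numerically as in the following sketch. The function name `expansion_coefficient_pe` and the numerical values are hypothetical, chosen only to show the relative size of the terms; the formula is the one obtained in the example.

```python
import math

def expansion_coefficient_pe(l1, l2, t1, t2, r_l, r_t):
    """Return (alpha, p.e. of alpha) for the relation of Example 8."""
    alpha = (l2 - l1) / (l1 * (t2 - t1))
    # Relative squared p.e.: the length terms and the temperature term.
    rel_sq = (r_l ** 2) * (1 / l1 ** 2 + 2 / (l2 - l1) ** 2) \
        + 2 * r_t ** 2 / (t2 - t1) ** 2
    return alpha, alpha * math.sqrt(rel_sq)

# Hypothetical rod: lengths in mm, temperatures in degrees.
alpha, pe = expansion_coefficient_pe(1000.0, 1001.2, 10.0, 110.0, 0.002, 0.05)
print(alpha, pe, pe / alpha)
```

With figures of this kind the term in $(l_2 - l_1)^2$ is much the largest, since the small difference of lengths carries nearly the whole of the uncertainty.
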
26.     Errors  due  to  separable  causes. 

When  the  accidental  errors  which  enter  into  a  measurement 
can  be  divided  into  a  number  of  separate  parts,  each  of  which  is 
independent of the others, and has a known M.S.E. or P.E., the
M.S.E. or P.E. of the whole error can be subdivided into the same
number of parts. For if an error $\epsilon$ be composed of three separate
parts $\epsilon_1$, $\epsilon_2$, $\epsilon_3$, all independent of one another, we have

$\epsilon = \epsilon_1 + \epsilon_2 + \epsilon_3$.

Then it follows as in § 22 that

$\mu^2 = \mu_1^2 + \mu_2^2 + \mu_3^2$,

and $r^2 = r_1^2 + r_2^2 + r_3^2$.

EXAMPLE. 

A  base  line  is  measured  by  successive  end  to  end  placings  of  a  rod,  and  is 
found  to  be  100  times  the  length  of  the  rod.  If  a  be  the  p.e.  of  the  assumed 
length  of  the  rod,  due  to  uncertainty  of  the  assumed  temperature,  and  b  be 
the  P.E.  of  the  end  to  end  placings  of  the  rod,  and  c  the  p.e.  of  the  settings 
at  the  extremities  of  the  base  line,  find  the  p.e.  of  the  assumed  length  of  the 
base. 

(1) If the error in the assumed length of the rod be $x$, the resulting error
in the length of the base will be $100x$. The p.e. of the length of the base,
having regard to this class of error only, will be $100a$.

(2) There will be 100 settings of the rod, and the p.e. which enters at each
of the 99 intermediate points will be $b$. The p.e. of the length of the base, if
this be the only kind of error, will be $\sqrt{99}\,b$.

(3) The p.e. of the length of the base, due to the end readings, will be
$\sqrt{2}\,c$.

Thus, if $r$ be the total p.e. of the length of the base,
$r^2 = 100^2 a^2 + 99 b^2 + 2c^2$.
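
The three contributions of this example may be combined numerically as in the following sketch. The function name `base_line_pe` and the parameter `n_rods` are assumptions made for illustration; the text treats only the case of 100 rod-lengths.

```python
import math

def base_line_pe(a, b, c, n_rods=100):
    """Total p.e. of a base measured as n_rods rod-lengths.

    a: p.e. of the assumed rod length (enters n_rods times over, in full),
    b: p.e. of one end-to-end placing (n_rods - 1 independent joins),
    c: p.e. of one setting at an extremity (two independent settings).
    """
    rod = n_rods * a
    joins = math.sqrt(n_rods - 1) * b
    ends = math.sqrt(2) * c
    return math.sqrt(rod ** 2 + joins ** 2 + ends ** 2)

# Hypothetical p.e.'s, all in the same unit of length.
print(base_line_pe(0.01, 0.02, 0.03))
```

The error of the rod length is multiplied by the full number of rods, whereas the placing errors grow only as the square root of their number; the former therefore soon outweighs the latter in a long base.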

27.  Total  Probable  Error  when  a  Systematic  Error  is 
present. 

The  case  where  the  total  error  in  an  observation  is  partly  due 
to  accidental  causes,  and  partly  due  to  systematic  causes,  follows 
naturally  from  the  discussion  of  the  last  paragraph. 

Let $r$ = P.E. of an observation, when the accidental errors only
are considered,
$r_0$ = the mean error arising from the systematic causes,
$r'$ = total P.E. of such an observation.

Then $r'^2 = r^2 + r_0^2$,

as in § 22.

If the observation be repeated $n$ times, and the mean taken,
the final P.E. $r''$ is given by

$$r''^2 = \frac{r^2}{n} + r_0^2.$$

The systematic error is in no way affected by the repetition of
the observation, while the P.E. due to the accidental causes is
decreased to $\dfrac{r}{\sqrt{n}}$.

The correct interpretation of the last equation is of vital
importance in all applications of our present subject. If only
accidental errors are present, and if $r$ be the P.E. of a single
observation, then the P.E. of the mean of $n$ observations is $\dfrac{r}{\sqrt{n}}$. This
expression would lead one to expect that it would be possible to
obtain a result free from error simply by increasing the number
of observations. In practice, however, it is impossible to eliminate
all traces of systematic errors. If the systematic error be
represented by $r_0$, we have seen above that the total P.E. of the
mean of $n$ observations is $\sqrt{\dfrac{r^2}{n} + r_0^2}$. It may happen that when
$n$ is relatively small $r_0^2$ is negligible in comparison with $\dfrac{r^2}{n}$; but
so long as $r_0$ is finite, however small it may be, by increasing the
value of $n$ we shall eventually reach a stage where $\dfrac{r^2}{n}$ becomes
negligible in comparison with $r_0^2$. When such a stage has been
reached, no further improvement in the precision of the mean
value can be attained by increasing the number of observations.
Or,  to  put  this  in  other  words,  as  the  number  of  observations  is 
made  greater  and  greater,  a  stage  is  eventually  reached  beyond 
which  the  error  of  the  result  is  fixed  by  the  constant  or  systematic 
errors,  and  not  by  the  probable  accidental  error.  A  further 
increase  in  the  number  of  observations  produces  no  corresponding 
increase  in  the  accuracy  of  the  result,  unless  the  conditions  of 
observation  can  be  varied  from  time  to  time,  so  as  to  vary  the 
magnitude  and  sign  of  the  systematic  errors,  thus  causing  them 
to  appear  effectively  as  accidental  errors. 
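
The limit set by the systematic error may be seen numerically in the following sketch, which evaluates $\sqrt{r^2/n + r_0^2}$ for increasing $n$. The function name `pe_of_mean` and the values of $r$ and $r_0$ are hypothetical.

```python
import math

def pe_of_mean(r, r0, n):
    """Total P.E. of the mean of n observations: sqrt(r^2/n + r0^2)."""
    return math.sqrt(r ** 2 / n + r0 ** 2)

# Hypothetical values: accidental P.E. 1.0, systematic error 0.1.
r, r0 = 1.0, 0.1
for n in (1, 10, 100, 1000, 10000):
    print(n, round(pe_of_mean(r, r0, n), 4))
```

However far $n$ is carried, the printed values never fall below $r_0$; with these figures the two contributions are equal at $n = 100$, and beyond that point further repetition gains very little.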

The  P.E.  computed  from  the  residuals  is  independent  of  the 
presence  of  a  constant  error.  For  if  a  constant  quantity  be  added 
to  all  the  observed  values  in  a  series,  the  mean  value  is  altered 
by  the  same  amount,  and  the  residuals  (and  the  P.E.)  remain 
unchanged. The probable error is a measure, not of the deviations
of the observed values from the true value, but of their deviations
from the mean of an infinite number of observations.
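
That a constant error leaves the computed P.E. untouched may be verified as in the following sketch, which uses the formula $r = 0.6745\sqrt{[vv]/(n-1)}$. The function name `probable_error` and the readings are hypothetical.

```python
import math

def probable_error(observations):
    """P.E. of a single observation, computed from the residuals."""
    n = len(observations)
    mean = sum(observations) / n
    vv = sum((x - mean) ** 2 for x in observations)
    return 0.6745 * math.sqrt(vv / (n - 1))

obs = [10.02, 9.98, 10.05, 9.97, 10.01]   # hypothetical readings
shifted = [x + 0.30 for x in obs]          # the same with a constant error added
print(probable_error(obs), probable_error(shifted))
```

The two values agree to within rounding, although every reading of the second series is in error by 0.30; the residuals give no warning whatever of such an error.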

28.  The  Correction  of  Statistics  for  the  Effects  of  a 
known  Probable  Error  of  Observation. 

When  a  table  is  constructed  to  exhibit  the  distribution  of 
frequencies  of  successive  values  of  a  certain  measured  property 
whose  P.E.  is  known,  it  is  necessary  to  consider  the  effect  of  the 
probable  error  upon  the  table.  For  simplicity's  sake  we  shall 
consider  in  detail  the  problem  discussed  by  Eddington*  in  his 
treatment  of  this  problem. 

Suppose we have a table giving the results of the counts
of stars between given limits of magnitude, and suppose the P.E.
of a magnitude determination is known. Let this P.E. be $\dfrac{0.477}{h}$,
so that the probability of an error $x$ is $Ce^{-h^2x^2}$.

Let $u(m)\,dm$ = observed number of stars between magnitudes
$m$ and $m + dm$,
$v(m)\,dm$ = true number.

Of the stars whose true magnitude is between $(m + x)$ and
$(m + x + dx)$ the proportion $\dfrac{h}{\sqrt{\pi}}\,e^{-h^2x^2}\,dx$ will have an error between
$-x$ and $-(x + dx)$, and will be observed as of magnitude $m$.
Thus we have

$$u(m) = \frac{h}{\sqrt{\pi}} \int_{-\infty}^{+\infty} v(m + x)\,e^{-h^2x^2}\,dx.$$

By the symbolic form of Taylor's theorem

$$v(m + x) = e^{x\frac{d}{dm}}\,v(m),$$

and therefore

$$u(m) = \frac{h}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-h^2x^2 + x\frac{d}{dm}}\,v(m)\,dx.$$

Treating this integral as a special case of

$$\int_{-\infty}^{+\infty} e^{-a^2x^2 + bx}\,dx = \frac{\sqrt{\pi}}{a}\,e^{b^2/4a^2},$$

we find

$$u(m) = \exp\left(\frac{1}{4h^2}\frac{d^2}{dm^2}\right) v(m),$$

and

$$v(m) = \exp\left(-\frac{1}{4h^2}\frac{d^2}{dm^2}\right) u(m) = u(m) - \frac{1}{4h^2}\,u''(m) + \frac{1}{2}\left(\frac{1}{4h^2}\right)^2 u''''(m) - \text{etc.}$$

* Monthly Notices, R.A.S., Vol. lxxiii, pp. 359, 360, from which the whole of
this discussion has been taken.

When the p.e. is small, it is sufficient to consider the first and
second terms only. If $\alpha$ be the successive intervals of magnitude,
the tabular second difference is

$$u(m + \alpha) + u(m - \alpha) - 2u(m) = \alpha^2 u''(m) \text{ approximately.}$$

For

$$u(m + \alpha) = u(m) + \alpha u'(m) + \frac{\alpha^2}{2}\,u''(m) \text{ approximately,}$$

and

$$u(m - \alpha) = u(m) - \alpha u'(m) + \frac{\alpha^2}{2}\,u''(m).$$

Also

$$\frac{1}{2h} = 1.046 \times \text{probable error}.$$

Thus the approximate correction is

$$-\left(\frac{1.046 \times \text{probable error}}{\text{tabular interval}}\right)^2 \times \text{tabular second difference}.$$
29. The Precision of the Probable Error deduced
from $r = 0.6745\sqrt{\dfrac{[vv]}{n-1}}$.

In § 23 we derived certain expressions for the residuals of
$n$ observations in terms of the true errors of the observations.
If each of these expressions be squared, and the results added
together, we obtain the expression

$$[vv] = \frac{1}{n^2}\left\{ n(n-1)[\epsilon\epsilon] - 2n[\epsilon_s\epsilon_t] \right\},$$

where $[\epsilon_s\epsilon_t]$ denotes the sum of the products of all possible pairs
from among the $n$ quantities $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$.
The equation may be written

$$\frac{[vv]}{n-1} = \frac{[\epsilon\epsilon]}{n} - \frac{2}{n(n-1)}[\epsilon_s\epsilon_t].$$

If we formed the equation for a large number of samples
of $n$ observations from among a very large number of observations,
and took the average value of each side of the equation, we should
obtain

$$\frac{[vv]}{n-1} = \mu^2 - \text{mean value of } \epsilon_s\epsilon_t.$$

The derivation of the usual formula for $\mu^2$ seems therefore to
be equivalent to regarding the mean value of $\epsilon_s\epsilon_t$ as zero. That
this is justifiable can easily be seen from § 20, equation (4'), where
it was found that the mean value of the product $V_sV_t$, for all
possible products among $N$ residuals, is equal to $-\dfrac{\mu^2}{N-1}$. When
the number of observations $N$ is sufficiently large, it was seen
that the $V$'s could be regarded as the true errors $\epsilon$, so that we
may say that

the mean value of $\epsilon_s\epsilon_t = -\dfrac{\mu^2}{N-1}$.

When the number of observations $N$ is very large, this quantity
becomes vanishingly small, and may be neglected. Our equation
above may then be written

$$\mu^2 = \frac{[vv]}{n-1}.$$

This equation yields the mean result for a large number of
cases. But in an individual set of $n$ observations it will not be
exactly satisfied; $\sqrt{\dfrac{[vv]}{n-1}}$ will not be the accurate M.S.E. of an
infinite series of observations of which the given $n$ observations
form a random sample.

The error made in estimating $\mu^2$ from the residuals in any
particular case is

$$\frac{[vv]}{n-1} - \mu^2 = \frac{[\epsilon\epsilon]}{n} - \frac{2}{n(n-1)}[\epsilon_s\epsilon_t] - \mu^2.$$

The square of this error is

$$\frac{[\epsilon\epsilon]^2}{n^2} + \frac{4[\epsilon_s\epsilon_t]^2}{n^2(n-1)^2} - \frac{4[\epsilon\epsilon][\epsilon_s\epsilon_t]}{n^2(n-1)} + \mu^4 - \frac{2\mu^2[\epsilon\epsilon]}{n} + \frac{4\mu^2[\epsilon_s\epsilon_t]}{n(n-1)}.$$
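If we take the mean value of this expression for a large
number of cases, the result will be the (M.S.E.)$^2$ of $\mu^2$. The mean
values of all the terms in the expression above will be considered
in turn:

$$[\epsilon\epsilon]^2 = [\epsilon^4] + 2[\epsilon_s^2\epsilon_t^2],$$

and

$$[\epsilon^4] = n \times \text{mean value of } \epsilon^4 = n\,\frac{2h}{\sqrt{\pi}}\int_0^\infty x^4 e^{-h^2x^2}\,dx = \frac{3n}{4h^4} = 3n\mu^4,$$

$$2[\epsilon_s^2\epsilon_t^2] = 2 \times \frac{n(n-1)}{2} \times \text{mean value of } \epsilon_s^2\epsilon_t^2 = n(n-1)\mu^4,$$

therefore

$$\text{mean value of } \frac{[\epsilon\epsilon]^2}{n^2} = \frac{3}{n}\mu^4 + \frac{n-1}{n}\mu^4.$$

Again

$$[\epsilon_s\epsilon_t]^2 = [\epsilon_s^2\epsilon_t^2] + \text{terms involving first powers of } \epsilon.$$

When the mean value is formed the terms involving first powers
vanish, therefore

$$\text{mean value of } \frac{4[\epsilon_s\epsilon_t]^2}{n^2(n-1)^2} = \frac{4}{n^2(n-1)^2} \times \frac{n(n-1)}{2} \times \text{mean value of } \epsilon_s^2\epsilon_t^2 = \frac{2\mu^4}{n(n-1)}.$$

The mean value of the third term vanishes on account of the
factor $[\epsilon_s\epsilon_t]$. The mean value of the fifth term is $-2\mu^4$, and the
sixth term vanishes on account of the factor $[\epsilon_s\epsilon_t]$.
Thus we obtain the equations

$$(\text{M.S.E.})^2 \text{ of } \mu^2 = \frac{3\mu^4}{n} + \frac{(n-1)\mu^4}{n} + \frac{2\mu^4}{n(n-1)} + \mu^4 - 2\mu^4 = \frac{2\mu^4}{n-1},$$

$$\text{M.S.E. of } \mu^2 = \mu^2\sqrt{\frac{2}{n-1}}.$$

Since M.S.E. of $\mu^2 = 2\mu \times$ M.S.E. of $\mu$,

we finally obtain the result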

$$\frac{\text{M.S.E. of } \mu}{\mu} = \frac{1}{\sqrt{2(n-1)}} = \frac{0.707}{\sqrt{n-1}}.$$

Also, since $r = 0.6745\mu$, we have the relations

$$\frac{\text{M.S.E. of } r}{r} = \frac{0.707}{\sqrt{n-1}},$$

and

$$\frac{\text{P.E. of } r}{r} = \frac{0.4769}{\sqrt{n-1}},$$

where $r$ is derived from the formula $r = 0.6745\sqrt{\dfrac{[vv]}{n-1}}$.

The following table gives the factor $\dfrac{0.4769}{\sqrt{n-1}}$ for some different
values of $n$.

   n     (P.E. of r)/r     20 % error     50 % error

   5         .288             .64            .24
  10         .159             .40            .034
  15         .127             .29            .008
  20         .109             .21            .0002
  30         .089             .12            .00014
  40         .076             .076           $8 \times 10^{-6}$
  50         .068             .047           $6 \times 10^{-7}$
  60         .062             .030           $5 \times 10^{-8}$
 100         .048             .0050

By the aid of this table, and the table of $\Theta(t)$ given in
Appendix I, we may gain some idea of the number of observations
which must be taken in order to yield values of the P.E. which are
deserving of confidence. In the third and fourth columns above
are given the probabilities that the P.E. should be 20 % out, and
50  %  out,  respectively.  It  is  seen  that  with  10  observations  the 
odds  are  only  3  to  2  that  the  calculated  P.E.  is  within  20  %  of  the 
correct  value,  and  about  30  to  1  that  it  is  within  50  %  of  the 
correct  value.  The  smallest  number  of  observations  whose  P.E. 
shall  be  regarded  as  trustworthy  will  be  to  a  certain  extent 
a  matter  of  individual  opinion,  depending  upon  the  odds  which 
the  individual  is  prepared  to  regard  as  practically  equivalent  to 
certainty.  But  one  can  at  any  rate  regard  50  as  a  number  sufficient 
to  yield  a  fairly  reliable  P.E. 
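
The entries of the table may be recomputed as in the following sketch, which assumes that the error of the computed $r$ follows the normal law and that the tabulated probability integral is the function available in Python as `math.erf`. The function names `pe_ratio` and `prob_out_by` are introduced for illustration only. Values obtained in this way need not agree with every printed entry; for $n = 5$, for instance, the formula gives a factor of about 0.238 rather than the .288 of the table.

```python
import math

def pe_ratio(n):
    """The factor 0.4769 / sqrt(n - 1), i.e. (P.E. of r) / r."""
    return 0.4769 / math.sqrt(n - 1)

def prob_out_by(fraction, n):
    """Probability that the computed P.E. is out by more than `fraction` of r.

    An error of k probable errors is exceeded with probability
    1 - erf(0.4769 * k), and here k = fraction / pe_ratio(n).
    """
    k = fraction / pe_ratio(n)
    return math.erfc(0.4769 * k)

for n in (5, 10, 15, 20, 30, 40, 50, 60, 100):
    print(n, round(pe_ratio(n), 3), prob_out_by(0.20, n), prob_out_by(0.50, n))
```

The odds quoted in the text for 10 observations follow at once: a probability of about 0.40 of being 20 % out corresponds to odds of 3 to 2 against so large an error.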

When the number of observations is small, say 10, it is scarcely
justifiable to regard $0.6745\sqrt{\dfrac{[vv]}{n-1}}$ as the P.E. of a
single observation. It only yields an approximation, which cannot
be regarded as a reliable one, to the value of the P.E.

There  is  another  aspect  of  the  question.  The  method  of  least 
squares is only strictly applicable to problems where the distribution
of frequencies can be represented by a curve of the general
form  of  the  curve  shown  in  fig.  3.  The  problem  of  determining 
the  precision  of  the  measurements  is  equivalent  to  determining 
the  parameter  h  which  defines  the  exact  form  of  the  curve.  But 
the  value  of  h,  or  of  the  P.E.  r,  must  obviously  cease  to  have  its 
original  meaning  when  the  curve  of  frequencies  is  dissymmetrical, 
or  is  ill-defined  on  account  of  the  smallness  of  the  number  of 
observations.  An  examination  of  the  frequency  distributions 
shown  in  figs.  1,  4,  5,  6  and  7,  will  show  that  it  can  scarcely  be 
possible  to  gain  any  accurate  idea  of  the  true  form  of  the  frequency 
curve  for  a  large  number  of  observations  from  the  chance  distribution 
of  a  few  observations. 

It is only when the number of observations is large, say 50 or
more, that the P.E. calculated from the formula

$$r = 0.6745\sqrt{\frac{[vv]}{n-1}}$$

can be regarded as a reliable measure. If we use this formula to
calculate  r  for  a  small  number  of  observations,  say  10,  we  cannot 
expect  another  observer,  working  under  similar  conditions,  to 
obtain  the  same  value  of  r  for  similar  observations. 

When  the  number  of  observations  is  small,  or  when  the  curve 
of  frequencies  is  dissymmetrical  (as  in  fig.  7),  the  calculated  P.E. 
can  only  be  regarded  as  some  kind  of  measure  of  the  mutual 
agreement  of  the  observations  in  the  series,  a  small  P.E.  indicating 
that  the  disagreement  between  individual  observations  is  small. 

30.     A  Comparison  of  the  two  Formulae  for  r. 

It has been shown above that when $r$ is calculated from the
squares of the residuals, the M.S.E. of $r$ is given by

$$\frac{\text{M.S.E. of } r}{r} = \frac{0.707}{\sqrt{n-1}}.$$

Helmert* has shown, by an investigation which is too long
and complicated to be included here, that when $r$ is calculated
from Peters' formula

$$r = 0.8453\,\frac{[\,|v|\,]}{\sqrt{n(n-1)}},$$

the M.S.E. of $r$ is given by

$$\frac{\text{M.S.E. of } r}{r} = \sqrt{\frac{\pi - 2}{2(n-1)}} = \frac{0.755}{\sqrt{n-1}}.$$

Thus the P.E. of $r$ calculated from the squares of the residuals
is less than the P.E. of $r$ calculated from the sum of the residuals
in the ratio .707 : .755 or 1 : 1.07. The first method of calculating
r  is  thus  slightly  better  than  the  second,  but  the  difference  in 
their  precision  is  not  sufficiently  great  to  justify  the  entire  use 
of  the  first  formula  and  the  neglect  of  Peters'  formula.  Peters' 
formula  has  a  very  great  advantage  in  that  the  labour  involved  in 
its  use  is  by  far  less,  particularly  when  the  number  of  observations 
is  great.  And  it  is  generally  found  that  when  the  number  of 
observations  is  fairly  large,  and  the  curve  of  errors  is  symmetrical, 
the  two  formulae  for  r  yield  almost  identical  results.  The  two 
formulae  have  an  equally  strong  theoretical  basis,  and  when  the 
results  disagree,  it  is  because  the  frequency  distribution  does  not 
follow  the  normal  law;  and  in  this  case  the  method  of  least  squares 
should  be  applied  with  considerable  caution. 

The  following  practical  consideration  is  very  important.  In 
a  long  series  of  observations  it  often  happens  that  one  or  two 
observations are rejected as discordant. The retention of a discordant
observation generally makes a considerable difference in
the value of $\sqrt{[vv]}$, but a very much smaller difference in the value
of $[\,|v|\,]$. Thus when there are observations which we are doubtful
about retaining, it is probably better to use Peters' formula.
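
The different behaviour of the two formulae in the presence of a discordant observation may be illustrated by the following sketch. The function names and the data are hypothetical; the coefficients 0.6745 and 0.8453 are those of the two formulae for $r$.

```python
import math

def pe_from_squares(obs):
    """r = 0.6745 * sqrt([vv] / (n - 1))."""
    n = len(obs)
    mean = sum(obs) / n
    vv = sum((x - mean) ** 2 for x in obs)
    return 0.6745 * math.sqrt(vv / (n - 1))

def pe_peters(obs):
    """Peters' formula: r = 0.8453 * (sum of |v|) / sqrt(n (n - 1))."""
    n = len(obs)
    mean = sum(obs) / n
    abs_sum = sum(abs(x - mean) for x in obs)
    return 0.8453 * abs_sum / math.sqrt(n * (n - 1))

clean = [4.9, 5.1, 5.0, 4.8, 5.2, 5.0, 4.9, 5.1, 5.0, 5.0]  # hypothetical
with_outlier = clean + [6.0]                                 # one discordant value

ratio_squares = pe_from_squares(with_outlier) / pe_from_squares(clean)
ratio_peters = pe_peters(with_outlier) / pe_peters(clean)
print(ratio_squares, ratio_peters)
```

In this series the single discordant value enlarges $r$ by a greater factor under the formula of squares than under Peters' formula, in accordance with the remark above.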

* Astron. Nachrichten, Bd. 88, No. 2096-7.


CHAPTER   IV 

OBSERVATIONS    OF    DIFFERENT    WEIGHT 

31.     The  Weighting  of  Observations. 

We  have  hitherto  regarded  all  our  observations  as  having 
equal  precision,  or,  as  is  more  generally  said,  as  having  equal 
weight.  It  is  now  necessary  to  consider  how  our  formulae  must 
be  modified  when  the  observations  are  regarded  as  having  unequal 
weight,  and  consequently  as  having  unequal  importance  in  the 
determination  of  the  most  plausible  value  of  the  unknown.  The 
meaning of "weight" and its effect upon the least squares
solution  can  be  most  clearly  seen  by  the  consideration  of  a 
simple  example. 

Let $x_1$, $x_2$, $x_3$, $x_4$ be four observed values of an unknown
quantity $x$. There are four equations of condition:

$x - x_1 = v_1$,
$x - x_2 = v_2$,
$x - x_3 = v_3$,
$x - x_4 = v_4$.

The most plausible value of the unknown is

$$x = \frac{x_1 + x_2 + x_3 + x_4}{4}.$$

Now suppose the first three observations to be grouped together,
yielding a value $x'$ $\left(= \dfrac{x_1 + x_2 + x_3}{3}\right)$ of the unknown for that group.

Then there will be only two equations of condition:

$x - x' = v'$,
$x - x_4 = v_4$.

Since $x'$ is the mean of three determinations, while $x_4$ is a single
determination, we may say that $x'$ has a weight 3, while $x_4$ has
unit or standard weight. The adopted value of the unknown may
now be written

$$x = \frac{x_1 + x_2 + x_3 + x_4}{4} = \frac{3x' + x_4}{3 + 1}.$$

If $x'$ and $x_4$ were direct observations of such a nature that it
could be decided from any consideration that three observations
such as $x_4$ must be taken and combined in order to yield a result
as valuable as $x'$, we could still say that $x'$ should have a weight 3,
while $x_4$ had the standard or unit weight. The adopted value of
the unknown would still be written

$$x = \frac{3x' + x_4}{3 + 1}.$$

The most plausible value of the unknown is derived by multiplying
each observed value by its weight, adding together the
products, and dividing the result by the sum of the weights.
The result may be generalised for any number of observations,
with any assigned weights. If $x_1, x_2, \ldots, x_n$ be $n$ observations,
whose weights are $p_1, p_2, \ldots, p_n$, respectively, the adopted value
of the unknown $x$ is

$$x = \frac{p_1x_1 + p_2x_2 + \ldots + p_nx_n}{p_1 + p_2 + \ldots + p_n} = \frac{[px]}{[p]}.$$

The  adopted  value  of  the  unknown  is  called  the  "weighted 
mean."  It  may  be  noted  in  passing  that  this  result  is  not 
altered  when  all  the  weights  are  increased  or  decreased  in  the 
same  ratio. 
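
The rule may be put into the form of a short sketch, which also checks the grouping of three observations into one of weight 3 and the indifference of the result to a common factor in the weights. The function name `weighted_mean` and the numerical values are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean [px] / [p]."""
    return sum(p * x for p, x in zip(weights, values)) / sum(weights)

x1, x2, x3, x4 = 1.0, 2.0, 3.0, 10.0      # hypothetical observations
x_group = (x1 + x2 + x3) / 3              # the group mean x'

plain = (x1 + x2 + x3 + x4) / 4
grouped = weighted_mean([x_group, x4], [3, 1])
scaled = weighted_mean([x_group, x4], [30, 10])
print(plain, grouped, scaled)
```

All three quantities are the same, since only the ratios of the weights enter into the weighted mean.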

In the simple case considered above, all that is meant by the
weight 3 assigned to $x'$ is that, on the average, three observations
of unit weight must be combined in order to yield a result as
good as $x'$. Similarly, in the general case, the meaning of a
weight $p_r$ assigned to an observation $x_r$ is that $p_r$ observations of
unit weight must be combined in order to yield as reliable a result
as $x_r$. Later on we shall have to consider the different methods
of  assigning  weights  to  observations,  but  for  the  present  we  are 
only  concerned  with  the  modifications  in  the  methods  of  solution 
produced by the difference in weights.

The observational equations for a series of $n$ weighted observations
may be written

$x_1 - x = v_1$,  weight $p_1$,
$x_2 - x = v_2$,  weight $p_2$,
. . . . . . . . . . . .          (A)
$x_n - x = v_n$,  weight $p_n$.

Let the M.S.E. of a hypothetical observational equation of unit
weight be $\mu$. Then it follows from the definition of weight, that
the mean values of the squares of the residuals are given by

$\overline{v_1^2} = \dfrac{\mu^2}{p_1}$, $\overline{v_2^2} = \dfrac{\mu^2}{p_2}$, etc.,

or $\mu^2 = p_1\overline{v_1^2} = p_2\overline{v_2^2} = \ldots = p_n\overline{v_n^2}$.

From this it follows that the above equations of condition are
reduced to equations of equal M.S.E., and therefore to equations of
equal weight, when each equation is multiplied by the square
root of the corresponding weight. The system of equations may
then be written

$\sqrt{p_1}\,(x_1 - x) = \sqrt{p_1}\,v_1$,
$\sqrt{p_2}\,(x_2 - x) = \sqrt{p_2}\,v_2$,          (B)
etc.

As all these equations have equal weight, the system may be
solved by the ordinary method of least squares. The value of
the unknown is obtained by making $\Sigma(pv^2)$ a minimum; i.e.
by making

$$p_1(x_1 - x)^2 + p_2(x_2 - x)^2 + \ldots \text{ etc.}$$

a minimum. Differentiating with respect to $x$, we obtain for the
value of the weighted mean

$$x = \frac{p_1x_1 + p_2x_2 + \ldots}{p_1 + p_2 + \ldots} = \frac{[px]}{[p]},$$

which agrees with the value previously derived.

The equations (B) above are all of unit weight, and therefore
the true M.S.E. $\mu$ of any one of them is given by

$$\mu = \sqrt{\frac{[pvv]}{n-1}}.$$

It follows that if $r_0$ be the P.E. of a single equation of unit weight,

$$r_0 = 0.6745\sqrt{\frac{[pvv]}{n-1}}, \quad\text{or}\quad r_0 = 0.8453\,\frac{[\sqrt{p}\,|v|]}{\sqrt{n(n-1)}}.$$

The P.E. of an observation of weight $p$ is $\dfrac{r_0}{\sqrt{p}}$; since such an
observation is equivalent to $p$ observations of unit weight and
P.E. $r_0$*. The weighted mean is equivalent to the arithmetic
mean of $[p]$ observations, and so its P.E. $r$ is given by

$$r = \frac{r_0}{\sqrt{[p]}} = 0.6745\sqrt{\frac{[pvv]}{[p](n-1)}}.$$

It was assumed in the original definition of weight that the $p$'s
were  integral.  It  should  be  noted,  however,  that  the  values  of 
both  the  weighted  mean  and  its  P.E.  are  independent  of  the  actual 
values  of  the  weights,  and  depend  only  on  their  relative  values. 
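
The formulae of this section may be collected into the following sketch, which also illustrates that a common factor in the weights alters $r_0$ but not the P.E. of the weighted mean. The function name `weighted_mean_pe` and the data are hypothetical.

```python
import math

def weighted_mean_pe(values, weights):
    """Return (weighted mean, r0, P.E. of the weighted mean).

    r0 = 0.6745 * sqrt([pvv] / (n - 1)) is the P.E. of unit weight,
    and the P.E. of the weighted mean is r0 / sqrt([p]).
    """
    n = len(values)
    p_sum = sum(weights)
    mean = sum(p * x for p, x in zip(weights, values)) / p_sum
    pvv = sum(p * (x - mean) ** 2 for p, x in zip(weights, values))
    r0 = 0.6745 * math.sqrt(pvv / (n - 1))
    return mean, r0, r0 / math.sqrt(p_sum)

values = [10.1, 10.3, 9.9, 10.2]     # hypothetical observations
weights = [1, 2, 1, 4]               # hypothetical weights

print(weighted_mean_pe(values, weights))
print(weighted_mean_pe(values, [4 * p for p in weights]))
```

When every weight is multiplied by 4 the P.E. of unit weight is doubled, but the mean and its P.E. are unaltered.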

32. Methods of Weighting.

(a)  Arbitrary  Scales. 
It  sometimes  happens  that  the  external  conditions  vary 
irregularly  during  a  series  of  observations,  in  such  a  way  that, 
although  the  effect  upon  the  separate  observations  cannot  be 
evaluated,  yet  the  observer  is  able  to  decide  that  some  of  the 
observations  are  less  affected  than  others.  In  such  a  case  it 
appears  legitimate  to  attach  greater  importance  to  the  less 
affected observations. This is done by assigning to these observations
a relatively higher weight than the more affected observations,
the  relative  values  of  the  weights  being  determined  by  the  observer 
according  to  some  arbitrary  scale  which  he  sets  up  for  himself. 
Thus  an  astronomical  observer  making  a  long  series  of  observations 
extending  over  many  nights  may  fairly  attach  different  weights  to 
the  results  of  the  separate  nights  according  to  the  steadiness  or 
unsteadiness  of  seeing.  When  once  these  weights  have  been 
assigned,  the  weighted  mean  and  its  P.E.  can  be  immediately 
evaluated  by  the  use  of  the  formulae  deduced  above. 

* This result may also be derived as follows. Since $r_0$ is the p.e. of $\sqrt{p_r}\,v_r$, the
p.e. of $v_r$ is $\dfrac{r_0}{\sqrt{p_r}}$.

The greatest disadvantage of this method of weighting lies in
its  arbitrary  and  personal  nature,  as,  in  general,  two  observers 
making  precisely  the  same  series  of  observations  would  not  assign 
the  same  relative  weights  to  the  separate  observations.  Further, 
a  computer  reducing  a  set  of  observations  may  legitimately  reject 
the  weights  assigned  by  the  observer  when  he  considers  that  the 
observations  are  more  affected  by  other  factors  not  considered  by 
the  observer,  than  by  the  factors  on  which  the  weights  are  based. 
Thus,  if  an  observer  assigns  weights  to  observations  of  the  moon 
according  to  the  steadiness  of  seeing,  the  computer  may  reject 
these  weights  if  he  finds  that  the  observations  are  affected  by  the 
inequalities  of  the  moon's  limb  to  a  greater  extent  than  by  the 
differences  in  seeing. 

It  is  impossible  to  give  any  rules  for  the  guidance  of  the 
inexperienced  observer  as  to  the  formation  of  a  scale  of  weights  to 
represent  the  varying  conditions  during  a  series  of  observations. 
His  safest  plan  is  to  observe  only  when  external  conditions  are 
fairly stable, and to assign the same weights to all his observations,
unless he has some very good reason for doing otherwise.

The  general  tendency  of  most  observers  is  to  overdo  this  type 
of  weighting;  i.e.  to  regard  the  bad  observations  as  worse  than 
they really are. "It appears that the longer time one is compelled
to bestow, and does bestow, upon observations made under
less  favourable  circumstances,  in  a  great  measure  compensates 
external  disadvantages ;  and  that  causes  of  errors  of  observation 
of  which  the  observer  himself  has  not  been  conscious,  often 
influence  him  no  less  than  those  which  obtrude  themselves  upon 
him*." 

(b) Number of Observations in Grouped Means.
When it is required to combine a number of quantities each
of which is the mean of a group of observations, nothing being
known of the individual observations in each group, each group-
mean is given a weight proportional to the number of observations
in the group. The weighted mean thus obtained is clearly
the same as the mean of all the observations. But the P.E. derived
from grouped means may differ slightly from that derived from the
actual observations.

* Ordnance Survey. Principal Triangulation.

(c) Weighting by Probable Errors*.

Let $x_1, x_2, \ldots, x_n$ be separate determinations of a quantity $x$,
where the p.e. $r_s$ of $x_s$ ($s = 1, 2, \ldots, n$) is known. The most probable
value of $x$ is obtained by making

$\Sigma\,h_s^2 (x_s - x)^2$ a minimum.

This is equivalent to making

$\Sigma\,\dfrac{(x_s - x)^2}{r_s^2}$ a minimum.

Differentiating with respect to $x$, we obtain

$x = \dfrac{[p_s x_s]}{[p_s]}$, where $p_s = \dfrac{1}{r_s^2}$.

This  result  may  be  interpreted  as  follows : 

If it is necessary to combine a number of separate determinations
of an unknown quantity, where the P.E. of each separate
determination  is  known,  the  best  result  is  obtained  by  assigning 
to  each  determination  a  weight  inversely  proportional  to  the 
square  of  its  P.E. 
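
The rule may be sketched as follows. The function name `combine` and the two determinations are hypothetical. The P.E. returned for the weighted mean, $1/\sqrt{[p]}$, follows from § 31 on the assumption that the stated P.E.'s are trustworthy, an observation of unit weight then having unit P.E.

```python
import math

def combine(values, pes):
    """Weighted mean with weights 1/r^2, and its P.E. 1/sqrt([p])."""
    weights = [1.0 / r ** 2 for r in pes]
    total = sum(weights)
    mean = sum(p * x for p, x in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)

# Two hypothetical determinations: 10.0 with p.e. 0.1, and 10.4 with p.e. 0.2.
mean, pe = combine([10.0, 10.4], [0.1, 0.2])
print(mean, pe)
```

The determination with half the P.E. receives four times the weight, and the P.E. of the combination is smaller than that of either determination taken alone.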

This  method  of  weighting  is  undoubtedly  the  best  when  it  is 
possible  to  obtain  trustworthy  P.E.'s  of  the  separate  determinations, 
the  number  of  observations  on  which  each  determination  is  based 
being  not  too  small.  When  the  number  of  observations  is  small 
there  is  always  a  danger  of  a  run  of  luck  causing  the  observations 
in  a  group  to  fall  close  together,  so  yielding  a  very  small  p.e.,  and 
consequently  a  very  large  weight.  In  such  a  case  the  computer 
must  decide  whether  the  small  P.E.  represents  a  true  P.E.  or  is 
small  simply  through  the  accidentally  close  agreement  of  the 
observations  in  a  group. 

* In all that follows the M.S.E. $\mu$ may be used instead of the p.e. $r$, as the basis
of weighting.

The following description of a practical method of weighting,
extracted from Wright and Hayford's Adjustment of Observations
(page 76), is particularly instructive. "A long-continued series
of  observations  will  show  the  kind  of  work  an  instrument  is 
capable  of  doing  under  favourable  conditions;  and  if  work  is 
done only when the conditions are favourable, the P.E. derived
from  a  certain  number  of  results  will  generally  fall  within  limits 
that  can  be  assigned  a  priori.  For  example,  with  the  Lake 
Survey  primary  theodolites,  which  read  to  single  seconds,  the 
tenths  being  estimated,  the  work  of  several  seasons  showed  that 
the  mean  of  from  16  to  20  results  of  the  value  of  a  horizontal 
angle,  each  result  being  the  mean  of  a  reading  with  telescope 
direct  and  a  reading  with  telescope  reverse,  need  not  be  expected 
to be greater than 0″.3. If, therefore, after having measured
a series of angles in a triangulation net with these instruments,
the P.E.'s all fell within ±0″.3, it was considered sufficient to
assign  to  each  angle  the  same  weight." 

No  general  rules  can  be  given  as  to  the  best  methods  of 
assigning  weights.  The  computer  must  take  into  consideration 
all  the  information  available  concerning  the  external  conditions 
at  the  time  of  observation,  the  possible  presence  of  constant 
errors,  the  type  of  instrument  used,  the  reputation  of  the  observer 
for  accurate  work,  as  well  as  the  computed  P.E.  With  all  these  in 
mind  he  must  use  his  own  judgment  as  to  the  best  method  of 
procedure.  The  inexperienced  computer  must  guard  against 
assigning  widely  divergent  weights  to  his  observations,  unless  he 
has  very  strong  grounds  for  so  doing. 

In assigning weights according to P.E. it is necessary to
consider  the  possibility  of  a  systematic  or  constant  error  being 
present  in  the  set  of  observations,  causing  an  error  in  the  final 
result.  For,  since  the  P.E.  only  takes  account  of  the  accidental 
errors,  it  can  only  be  regarded  as  a  valid  measure  of  the  precision 
of  a  result  when  it  is  certain  that  the  result  is  not  affected  by 
systematic  errors.  Thus  in  an  attempt  to  combine  the  values 
of  the  solar  parallax  obtained  by  different  methods,  we  must 
consider  not  only  the  calculated  P.E.,  but  also  the  possible  sources 
of  systematic  errors.  Newcomb  states  that  the  errors  principally 
to  be  feared  in  a  determination  of  the  solar  parallax  are  not  the 
accidental errors treated by the method of least squares, but the
systematic ones arising principally from personal equation and an
imperfect reduction of the observations to the centre of the planet,
or to the sun. It is therefore useless to assign relative weights
based on the P.E.'s of the separate determinations. From a careful
study of the systematic errors entering into the different estimates
of the solar parallax, Newcomb assigned the weights shown in
the table below. The weights bear no very close relation to the
calculated p.e.'s.


                                             Parallax in Seconds   Weights

Gill's observations of Mars                    8.780 ± 0.020          1
Contact observations of transit of Venus       8.794 ± 0.018          1
Aberration and velocity of light               8.798 ± 0.005         16
Parallactic inequality of the moon             8.799 ± 0.007          5
Minor planets (Gill)                           8.807 ± 0.007          8
Leverrier's method                             8.818 ± 0.030          0.5
Venus ...                                      8.857 ± 0.022          1


33. An alternative Method of Evaluating the Precision
of the Weighted Mean when Weighting according to
Probable Errors.

It has been shown above that if we wish to combine a number
of determinations of any quantity when the P.E. of each separate
determination is known, the weight to be given to each
determination must be proportional to the inverse square of the
probable error. If $x_1, x_2, \ldots, x_n$ be the $n$ determinations,
whose p.e.'s are $r_1, r_2, \ldots, r_n$, the weight $p_s$ of the
determination $x_s$ is given by

$$p_s = \frac{1}{r_s^2}.$$

The weighted mean can be evaluated by means of the equation

$$x = \frac{[px]}{[p]} \qquad (1).$$

The P.E. of the weighted mean can be evaluated from the
residuals by the use of the formula

$$r = 0.6745\sqrt{\frac{[pvv]}{(n-1)\,[p]}} \qquad (2).$$

But there is another way of approaching the question. The
weighted mean $x$, defined by the equation

$$x = \frac{\dfrac{x_1}{r_1^2} + \dfrac{x_2}{r_2^2} + \ldots}{\dfrac{1}{r_1^2} + \dfrac{1}{r_2^2} + \ldots},$$

is a linear function of $n$ quantities $x_1$, $x_2$, etc., whose P.E.'s are
known. Hence if $r$ be the P.E. of $x$,

$$r^2 = \frac{\dfrac{1}{r_1^2} + \dfrac{1}{r_2^2} + \ldots}{\left(\dfrac{1}{r_1^2} + \dfrac{1}{r_2^2} + \ldots\right)^2},$$

or

$$\frac{1}{r^2} = \frac{1}{r_1^2} + \frac{1}{r_2^2} + \ldots + \frac{1}{r_n^2} \qquad (3).$$

The results derived from equations (2) and (3) will in general
differ. For the first of these bases the calculation of r upon the
differences between the individual determinations x_1, x_2, etc.;
while the second method neglects entirely these differences. There
naturally arises the question, "which of the two methods of
evaluating r is the better?" The answer to this question depends
to some extent upon the material under consideration. If the
differences between the individual determinations which have to
be combined are attributable to systematic errors entering into
different determinations in different ways, it is clear that the P.E.
of a determination can give no clear estimate of the reliability of
that determination. In such a case equation (3) yields a value of
r which is of no use as a measure of the reliability of the result.
We are then forced to use equation (2). If the number of
determinations to be combined be small, say, 3 or 4, and we have no
reason to suspect the presence of systematic errors, it is better to
use equation (3). In all other cases it is safer to use equation (2);
i.e. to calculate the P.E. from the residuals.
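The two ways of evaluating r may be compared numerically. The following is a minimal sketch in Python, not part of the original text; the function names and the three sample determinations are hypothetical and serve only as illustration. It evaluates the weighted mean of equation (1), the P.E. from the residuals as in equation (2), and the P.E. from the given p.e.'s alone as in equation (3).

```python
import math

PE_FACTOR = 0.6745  # probable error = 0.6745 x mean square error

def weighted_mean(xs, ps):
    """Equation (1): x = [px] / [p]."""
    return sum(p * x for p, x in zip(ps, xs)) / sum(ps)

def pe_from_residuals(xs, ps):
    """Equation (2): P.E. of the weighted mean from the residuals v."""
    n = len(xs)
    mean = weighted_mean(xs, ps)
    pvv = sum(p * (x - mean) ** 2 for p, x in zip(ps, xs))
    return PE_FACTOR * math.sqrt(pvv / ((n - 1) * sum(ps)))

def pe_from_given_pes(rs):
    """Equation (3): 1/r^2 = 1/r_1^2 + 1/r_2^2 + ... ."""
    return 1.0 / math.sqrt(sum(1.0 / r ** 2 for r in rs))

# Hypothetical determinations and their p.e.'s (illustrative values only).
xs = [10.20, 10.50, 9.90]
rs = [0.10, 0.20, 0.10]
ps = [1.0 / r ** 2 for r in rs]   # weights proportional to 1/r^2

print(weighted_mean(xs, ps))
print(pe_from_residuals(xs, ps))  # depends on the scatter of the determinations
print(pe_from_given_pes(rs))      # ignores the scatter entirely
```

When the determinations scatter more widely than their p.e.'s would suggest, the first estimate exceeds the second, which is the situation in which the text recommends equation (2).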


EXAMPLES. 

1.     Given  the  following  six  determinations  of  the  parallax  of  the  star 
Lalande  21185,  find  the  weighted  mean  and  its  P.E. 


Parallax   Weight (p)     x       px       v       v²        pv²

0″.507          8        107     856     104    10816      86528
  .438          5         38     190      35     1225       6125
  .381          2        -19     -38     -22      484        968
  .371          8        -29    -232     -32     1024       8192
  .350         13        -50    -650     -53     2809      36517
  .402         20          2      40      -1        1         20

               56                166                      138350

Let the parallax of the star be 0″.4 + x″ × 10⁻³. Then the six values of x
are given in the third column. Each value of x is multiplied by the
corresponding weight, and the result written in the fourth column. The weighted
mean gives for x,

$$x = \frac{166}{56} = 3 \text{ nearly}.$$

The adopted parallax is therefore 0″.403.

The residual v obtained by subtracting the weighted mean from x is
written in the fifth column, v² in the sixth column, and pv² in the seventh
column. The sum of the last column yields [pvv].

The P.E. of the weighted mean is

$$0.6745\sqrt{\frac{[pvv]}{(n-1)\,[p]}} = 0.6745\sqrt{\frac{138350}{5 \times 56}} = 15.0$$

in units of the third decimal place.
The final result may be written

0″.403 ± 0″.015.
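The arithmetic of Example 1 can be retraced by machine. The sketch below is an illustration in Python rather than part of the original text; the variable names are assumptions. It follows the tabular procedure of the example, including the rounding of the weighted mean of x to the nearest unit before the residuals are formed.

```python
import math

parallax = [0.507, 0.438, 0.381, 0.371, 0.350, 0.402]  # seconds of arc
p = [8, 5, 2, 8, 13, 20]                                # weights

# x is the excess over 0".4 in units of the third decimal place.
x = [round((value - 0.4) * 1000) for value in parallax]

sum_p = sum(p)
sum_px = sum(pi * xi for pi, xi in zip(p, x))
mean_x = round(sum_px / sum_p)          # 166/56, taken as 3 as in the example

# Residuals from the rounded mean, and the sum [pvv].
v = [xi - mean_x for xi in x]
pvv = sum(pi * vi * vi for pi, vi in zip(p, v))

n = len(x)
pe = 0.6745 * math.sqrt(pvv / ((n - 1) * sum_p))

print(sum_p, sum_px, pvv)
print(0.4 + mean_x / 1000, pe / 1000)   # adopted parallax and its P.E.
```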
2. Find the weighted mean and its p.e. for the following determinations
of the difference of longitude between two places:

Determination           Weights (p)     x       px       v      v²      pv²

19m 1s.42 ± 0s.044         1.1         -3      -3.3     -4      16     17.6
       .37 ±   .037        1.6         -8     -12.8     -9      81    129.6
       .38 ±   .036        1.7         -7     -11.9     -8      64    108.8
       .45 ±   .036        1.7          0       0       -1       1      1.7
       .60 ±   .046        1.0         15      15.0     14     196    196.0
       .55 ±   .045        1.0         10      10.0      9      81     81.0
       .57 ±   .047        1.0         12      12.0     11     121    121.0

                           9.1                  9.0                   655.7

The weights of the determinations are proportional to

$$\frac{1}{(44)^2},\ \frac{1}{(37)^2},\ \frac{1}{(36)^2},\ \frac{1}{(36)^2},\ \frac{1}{(46)^2},\ \frac{1}{(45)^2},\ \frac{1}{(47)^2}.$$

Let the weight of the last determination be made unity. Then the
others become

$$\left(\frac{47}{44}\right)^2,\ \left(\frac{47}{37}\right)^2,\ \left(\frac{47}{36}\right)^2,\ \left(\frac{47}{36}\right)^2,\ \left(\frac{47}{46}\right)^2,\ \left(\frac{47}{45}\right)^2,\ 1,$$

or 1.1, 1.6, 1.7, 1.7, 1.0, 1.0, 1.0.

These weights are written down in the second column of the table. The
longitude difference is assumed to be 19m 1s.45 + x × 0s.01. The separate
determinations of x are written in the third column, and the values of px in
the fourth column. The sum of the fourth column, divided by the sum of the
weights, gives 9.0/9.1 or 1.0 for the weighted mean of x. The residual v is x − 1.

The rest of the table is self-explanatory.

The P.E. of the weighted mean

$$= 0.6745\sqrt{\frac{655.7}{9.1 \times 6}} = 2.3$$

in units of the second decimal place.

The final result may therefore be written

19m 1s.460 ± 0s.023.
3. From the following determinations of the parallax of 61 Cygni find
the weighted mean, and its p.e.:

0″.316 ± 0″.016
  .216 ±   .029
  .333 ±   .035
  .290 ±   .009
  .300 ±   .007
  .387 ±   .015
  .328 ±   .029
  .298 ±   .005
  .238 ±   .020
  .388 ±   .017

[Give each determination a weight proportional to 1/(P.E.)² and evaluate
P.E. of the weighted mean from the residuals.]

4.     An  angle  was  determined  in  three  separate  years,  the  following  results 
being  obtained  : 


149° 16′ 51″.48 ± 0″.45
         48 .47 ± 0 .28
         49 .72 ± 0 .25


Find  the  most  probable  measure  of  the  angle. 

[In  this  case  the  differences  between  the  individual  observations  are 
greatly  in  excess  of  the  p.e.'s.  There  are  obviously  systematic  errors 
present,  so  that  the  p.e.'s  do  not  represent  the  whole  error.  It  is  probably 
preferable  in  this  case  to  adopt  the  simple  a.m.  as  the  value  of  the  angle.] 

5. If the probable accidental error of observing a star is 0″.30, and the
probable quasi-systematic error of a gradation is 0″.20, how would you
combine 7 observations of one star with 3 observations of another star?

Referring to page 51, we find that

(P.E.)² of mean of 7 observations = (.20)² + (1/7)(.30)²,
(P.E.)² of mean of 3 observations = (.20)² + (1/3)(.30)².

6. From the table of determinations of the solar parallax given on
page 67, using the weights assigned by Newcomb, show that the weighted
mean is

8″.802 ± 0″.005.

7. The difference between the observed values and the assumed value of
an angle is given by the following set of determinations:

1″.27 ± 0″.11   determined from 4 observations
1 .00 ± 0 .10        "        "  3      "
0 .91 ± 0 .07        "        "  4      "
1 .12 ± 0 .1[?]      "        "  6      "
1 .30 ± 0 .11        "        "  6      "
1 .42 ± 0 .19        "        "  6      "
1 .45 ± 0 .15        "        "  6      "

How would you combine these values to obtain the most probable value
(a) when the above probable errors are given simply by the discordances
between the individual observations of each group, and (b) when in each case
the probable error is determined from a long series of which each member of
the above groups is a sample?

In the former case, determine the most probable value of the angle and
its probable error.

[In the first case the observations are to be weighted according to the
number of observations in each group, while in the second case the p.e.'s
must also be taken into account.]


MISCELLANEOUS  EXAMPLES   INVOLVING   ONE   UNKNOWN. 

1. If the error of a clock determined at time

$0^h$ is $a_1^s \pm r_1$,
$12^h$ is $a_2^s \pm r_2$,
find the clock error and its p.e. for an interpolated time $8^h$.

2.  The  equatorial  velocity  of  the   sun,  determined  by  a  spectroscopic 
method,  yielded  the  following  results: 


Element     Number of lines     Km./sec.

Fe                18             1.857
Ti                 8             1.883
Cr                 6             1.883
Sc                 5             1.817
Ca                 3             1.845
V                  3             1.870
Zr                 3             1.915
Mn                 2             1.819
Mg                 1             1.989
Ni                 1             1.848

Combine these values (a) giving all elements equal weight, (b) giving each
element a weight equal to the number of lines measured.
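The two combinations asked for in this example may be sketched as follows. The code is an illustration in Python, not taken from the original text, and the variable names are assumptions; the figures are those of the table.

```python
# (element, number of lines, velocity in km./sec.) as tabulated above
data = [
    ("Fe", 18, 1.857), ("Ti", 8, 1.883), ("Cr", 6, 1.883), ("Sc", 5, 1.817),
    ("Ca", 3, 1.845), ("V", 3, 1.870), ("Zr", 3, 1.915), ("Mn", 2, 1.819),
    ("Mg", 1, 1.989), ("Ni", 1, 1.848),
]

velocities = [vel for _, _, vel in data]
lines = [count for _, count, _ in data]

# (a) every element given equal weight
equal_weight_mean = sum(velocities) / len(velocities)

# (b) each element weighted by the number of lines measured
line_weight_mean = sum(c * vel for c, vel in zip(lines, velocities)) / sum(lines)

print(round(equal_weight_mean, 4), round(line_weight_mean, 4))
```

The second mean lies nearer the iron value, as would be expected from the large weight that iron receives.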

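3. 2n+1 observations are made, each with the same p.e. Show that the
probability of the error of the median being between x and x+dx is

$$\frac{(2n+1)!}{(n!)^2}\left(\tfrac{1}{4} - \theta^2\right)^n \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2}\,dx,$$

where

$$\theta = \frac{h}{\sqrt{\pi}}\int_0^x e^{-h^2x^2}\,dx.$$

In the case of five observations, show that the chance of the error of the
median being numerically greater than r is 53/256.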
4. The probability of an event happening once in one trial is p, so that in
m trials the event happens on an average mp times. Find the p.e. of the
number mp.

Let q be the probability of the event not happening, so that

$$q = 1 - p.$$

The whole series is given by

$$(p+q)^m = p^m + m\,p^{m-1}q + \ldots + q^m.$$

This may be interpreted as follows: The frequency of the event happening
m times is $p^m$, of its happening (m−1) times is $m\,p^{m-1}q$, etc. The separate
terms of the Binomial Series give the frequency distribution of the different
possible numbers of successes in m trials. Or if x be the number of times
the event occurs in m trials, we have the following frequency distribution:

$$\begin{aligned}
x &= m, & f &= p^m,\\
x &= m-1, & f &= m\,p^{m-1}q,\\
x &= m-2, & f &= \frac{m(m-1)}{2!}\,p^{m-2}q^2,\\
&\text{etc.,}\\
x &= 0, & f &= q^m.
\end{aligned}$$

Taking the origin at x = m, we find,

$$\begin{aligned}
\text{mean value} &= m\,p^{m-1}q + 2\cdot\frac{m(m-1)}{2!}\,p^{m-2}q^2 + \ldots\\
&= mq\,(p+q)^{m-1}\\
&= mq.
\end{aligned}$$

$$\begin{aligned}
\Sigma f\,(x-m)^2 &= m\,p^{m-1}q + 2m(m-1)\,p^{m-2}q^2 + \ldots + m^2q^m\\
&= m\,p^{m-1}q + m(m-1)\,p^{m-2}q^2 + \ldots + m\,q^m\\
&\quad + m(m-1)\,p^{m-2}q^2 + \ldots + m(m-1)\,q^m\\
&= mq + m(m-1)\,q^2.
\end{aligned}$$

But

$$\begin{aligned}
(\text{M.S.E.})^2 = \Sigma f\,(x-mp)^2 &= \Sigma f\,(x-m)^2 - (\text{mean value})^2\\
&= mq + m(m-1)\,q^2 - m^2q^2 = mpq,
\end{aligned}$$

$$\text{P.E. of } mp = 0.6745\sqrt{mpq}.$$

The use of the formula will be perhaps best shown by a simple application.
It has been stated that in the British Isles the proportion of male to female
children is 1050 to 1000. Hence the probability that a child should be male
is 1050/2050. Of 100,000 births the number of male children should be

(1050/2050) × 100,000 or 51,219.

The m.s.e. of this estimate is

√(100,000 × 1050/2050 × 1000/2050) or 158 approximately,

and its P.E. is 0.6745 times this, or about 107.

If it were found that among 100,000 children born in this country, 51,500
were male, the deviation from the expected value would be only about twice
the m.s.e., and need not be regarded as abnormal. But if among 10,000,000
children 5,150,000 were male, the deviation from the expected value would be
28,100, while the m.s.e. of the deviation would be 1,580. The actual deviation
would be about 18 times the m.s.e., and we should conclude that the normal
ratio of male to female children had been definitely changed.
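The figures of this application are easily checked. The sketch below, in Python, is an illustration and not part of the original text; the function names are assumptions. It evaluates √(mpq), which is the m.s.e. of the number of successes, and 0.6745 √(mpq), which is the corresponding P.E., for the two samples considered.

```python
import math

def binomial_mse(m, p):
    """Mean square error of the number of successes in m trials: sqrt(mpq)."""
    return math.sqrt(m * p * (1.0 - p))

def binomial_pe(m, p):
    """Probable error of the number of successes: 0.6745 * sqrt(mpq)."""
    return 0.6745 * binomial_mse(m, p)

p_male = 1050 / 2050

for births, observed in [(100_000, 51_500), (10_000_000, 5_150_000)]:
    expected = births * p_male
    deviation = observed - expected
    mse = binomial_mse(births, p_male)
    # Deviation expressed in multiples of the m.s.e.
    print(births, round(expected), round(mse), round(deviation / mse, 1))
```

Since the m.s.e. grows only as the square root of the number of births while the deviation grows in direct proportion, the same excess of male births that is unremarkable in the smaller sample becomes decisive in the larger.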

5. Given that the p.e. of a single observation is .14, how many
observations must be taken and combined in order that the p.e. of the mean shall be
less than .02?

6. A series of 100 observations of an angle gives for the p.e. of a single
observation 1″.75. What is the probability that the error of the mean is not
greater than 0″.25?

7. An angle is measured 100 times, and the p.e. of a single observation
is 1″.75. How many errors will be greater than 0″.25 and less than 1″.25?

8. If the P.E. of a single observation is 1.5, how many observations must
be combined in order that the odds may be 3 to 1 that the mean is within
.25 of truth?


CHAPTER  V 

THE GENERAL PROBLEM OF THE ADJUSTMENT OF
INDIRECT OBSERVATIONS INVOLVING MORE THAN ONE
UNKNOWN QUANTITY

34.  In  the  cases  hitherto  considered,  the  problem  has  been 
to  find  the  most  probable  value  of  an  unknown  quantity,  given 
a  number  of  direct  observations  of  that  quantity.  The  arithmetic 
mean  was  adopted  as  the  best  value  of  the  unknown.  We  must 
now  consider  the  case  where  the  quantity  measured  is  not  itself 
the  unknown  whose  value  is  required,  but  is  expressible  as  a 
function  (not  of  necessity  linear)  of  a  number  of  unknown 
quantities.  The  problem  may  be  briefly  stated  thus :  Given  a 
number  of  measurements  of  certain  functions  of  a  number  of 
unknowns,  to  find  the  values  of  the  unknowns,  and  their  probable 
errors. 

Let the n observed quantities be $M_1, M_2, \ldots, M_n$; and let their
unknown errors be $v_1, v_2, \ldots, v_n$. Then it is given that $M_1 + v_1$,
$M_2 + v_2$, etc. can be accurately expressed as functions of the
unknowns X, Y, Z, etc.,

$$\begin{aligned}
f_1(X, Y, Z, \ldots) &= M_1 + v_1\\
f_2(X, Y, Z, \ldots) &= M_2 + v_2\\
&\ \ \vdots
\end{aligned} \qquad (1).$$

There will be one equation of this form for each observation; and
in the problems with which we shall have to deal, the number n
of equations in (1) will be greater than the number m of
unknowns.
Now suppose approximate values of the unknowns to be
known, or to have been deduced by solving a sufficient number of
the equations in (1). Let these approximate values be $X_0$, $Y_0$, $Z_0$,
etc., and let

$$X = X_0 + x, \quad Y = Y_0 + y, \quad Z = Z_0 + z, \text{ etc.,}$$

where it may be assumed that the corrections x, y, z, etc. are
small, so that their squares may be neglected.
The first equation in (1) may then be written

$$f_1(X_0, Y_0, Z_0, \ldots) + \frac{\partial f_1}{\partial X_0}\,x + \frac{\partial f_1}{\partial Y_0}\,y + \frac{\partial f_1}{\partial Z_0}\,z + \ldots = M_1 + v_1.$$

Let

$$\frac{\partial f_1}{\partial X_0} = a_1, \quad \frac{\partial f_1}{\partial Y_0} = b_1, \text{ etc.,}$$

$$-M_1 + f_1(X_0, Y_0, Z_0, \ldots) = -l_1.$$

Then equations (1) may be written

$$\begin{aligned}
a_1x + b_1y + c_1z + \ldots - l_1 &= v_1\\
a_2x + b_2y + c_2z + \ldots - l_2 &= v_2\\
&\ \ \vdots
\end{aligned} \qquad (2),$$

where the a's, b's, c's, l's, etc. are known.

Equations  (1)  or  (2)  are  called  the  "observational  equations." 
The  problem  has  now  been  reduced  to  the  case  where  the  equations 
are  all  linear.  If  the  values  of  x,  y,  z,  etc.  obtained  by  solving 
equations  (2)  be  small,  we  may  rest  content  with  our  solution ; 
but  if  some  of  them  should  be  large,  it  may  be  necessary  to  repeat 
the  solution,  taking  the  results  of  the  first  solution  as  approximate 
values of X, Y, Z, etc. It is found in practice that the
approximation step saves considerable labour even in cases where the original
observational equations are strictly linear.
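The repeated use of approximate values may be illustrated by a small numerical sketch. The code below is in Python and is not part of the original text. The observed function X e^(−Yt), the data, the starting values and all names are hypothetical, chosen only to show the procedure: the observational equations are linearised about the approximate values, the least-squares corrections x, y are found, and the process is repeated with the corrected values.

```python
import math

def model(X, Y, t):
    """Hypothetical observed function f(X, Y) = X * exp(-Y * t)."""
    return X * math.exp(-Y * t)

def corrections(X0, Y0, ts, Ms):
    """Least-squares corrections (x, y) to the approximate values (X0, Y0)."""
    aa = ab = bb = al = bl = 0.0
    for t, M in zip(ts, Ms):
        e = math.exp(-Y0 * t)
        a = e                      # df/dX at the approximate values
        b = -X0 * t * e            # df/dY at the approximate values
        l = M - model(X0, Y0, t)   # observed minus computed
        aa += a * a; ab += a * b; bb += b * b
        al += a * l; bl += b * l
    # Solve  aa x + ab y = al,  ab x + bb y = bl  by determinants.
    det = aa * bb - ab * ab
    return (al * bb - ab * bl) / det, (aa * bl - ab * al) / det

def fit(X0, Y0, ts, Ms, iterations=10):
    """Repeat the linearised solution, updating the approximate values."""
    X, Y = X0, Y0
    for _ in range(iterations):
        x, y = corrections(X, Y, ts, Ms)
        X, Y = X + x, Y + y
    return X, Y

ts = [0.0, 1.0, 2.0, 3.0, 4.0]
Ms = [model(2.0, 0.5, t) for t in ts]   # error-free "observations"
print(fit(1.8, 0.4, ts, Ms))
```

With these starting values the first solution already brings X and Y to within about one per cent of their true values, and the later repetitions alter them very little.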
35. Formation of the Normal Equations.

If the observational equations all have the same weight, or are
liable to the same mean error, the same discussion as for one
unknown quantity will apply (see §§ 13 and 14). The probability
of the occurrence of a residual v may be written

$$\frac{h}{\sqrt{\pi}}\,e^{-h^2v^2}\,dv.$$

The probability of the coexistence of the system of residuals
$v_1, v_2, \ldots, v_n$ may be written

$$\left(\frac{h}{\sqrt{\pi}}\right)^n e^{-h^2[vv]}\,dv_1\,dv_2 \ldots dv_n.$$

The most probable values of the unknowns will be such as
to make this probability a maximum. The expression is greatest
when [vv] is least, and so the most probable values of the unknowns
are given by the condition that

$$[vv] = \text{a minimum},$$

or

$$\sum_{r=1}^{r=n}\left(a_rx + b_ry + \ldots - l_r\right)^2 = \text{a minimum}.$$

The conditions for a minimum are obtained by equating to zero
the differential coefficients of this expression with respect to
x, y, etc.,

$$\begin{aligned}
a_1(a_1x + b_1y + \ldots - l_1) + a_2(a_2x + b_2y + \ldots - l_2) + \ldots &= 0\\
b_1(a_1x + b_1y + \ldots - l_1) + b_2(a_2x + b_2y + \ldots - l_2) + \ldots &= 0\\
&\ \ \vdots
\end{aligned} \qquad (3).$$

Collecting coefficients, we may write these equations in the form

$$\begin{aligned}
[aa]\,x + [ab]\,y + [ac]\,z + \ldots - [al] &= 0 = \xi\\
[ab]\,x + [bb]\,y + [bc]\,z + \ldots - [bl] &= 0 = \eta\\
[ac]\,x + [bc]\,y + [cc]\,z + \ldots - [cl] &= 0 = \zeta
\end{aligned} \qquad (4).$$

These  equations  are  called  the  normal  equations.  There  is 
one  normal  equation  corresponding  to  each  unknown,  and  our 
problem  has  therefore  been  reduced  to  that  of  solving  a  set  of 
linear  equations  whose  number  is  the  same  as  the  number  of 
unknowns.     The  solution  of  the  normal  equations  gives  the  most 
probable values of the corrections x, y, z, etc., and from these
corrections the values of the original unknowns X, Y, Z, etc. can
be immediately deduced.

Since the coefficients of the normal equations are symmetrical
about the principal diagonal, it is convenient to write the normal
equations in the following abbreviated form, with half the cross
products omitted,

$$\begin{aligned}
[aa]\,x + [ab]\,y + [ac]\,z + \ldots - [al] &= 0\\
+\ [bb]\,y + [bc]\,z + \ldots - [bl] &= 0\\
+\ [cc]\,z + \ldots - [cl] &= 0
\end{aligned} \qquad (5).$$

There are m normal equations when the number of unknowns
X, Y, Z, etc. is m. The total number of products to be estimated
is ½m(m+3), made up of m products [aa], [bb], etc., m products
[al], [bl], etc., and ½m(m−1) products [ab], [bc], [ca], etc.
Equations (3) may be written

$$[av] = 0 = [bv] = [cv] = \ldots \qquad (6),$$

corresponding to the relation [v] = 0 obtained in the case of one
unknown quantity. These relations afford a check upon the values
of $v_1$, $v_2$, etc. obtained from the solution of the equations.

If the observational equations are liable to different mean
errors, or in other words have different weights $p_1$, $p_2$, etc., the
equations (2) above must be multiplied by $\sqrt{p_1}$, $\sqrt{p_2}$, etc.,
respectively, so as to reduce them to equations of equal weight. Save
for this multiplication it is not permissible to multiply any
observational equation by an arbitrary factor. It is then necessary
to make

$$[pvv] \text{ a minimum}.$$

This condition leads to the set of normal equations

$$\begin{aligned}
[paa]\,x + [pab]\,y + [pac]\,z + \ldots - [pal] &= 0\\
[pab]\,x + [pbb]\,y + [pbc]\,z + \ldots - [pbl] &= 0\\
&\ \ \vdots
\end{aligned} \qquad (4'),$$

with the conditions

$$[pav] = [pbv] = \ldots = 0 \qquad (6').$$

If we slightly revise our notation so that

[aa] stands for Σpa²,
[ab] stands for Σpab,
etc.,

then equations (4′) and (6′) are included in equations (4) and (6),
and the same method of solution can be applied in each case. In
all that follows we shall take account of possible differences in
weight by regarding [aa], [ab], etc. as Σpa², Σpab, etc.
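The formation of the weighted normal equations and the check [pav] = [pbv] = 0 can be sketched for two unknowns. The code below is an illustration in Python, not part of the original text; the helper name `bracket` and the four observational equations with their weights are hypothetical.

```python
def bracket(p, u, v):
    """Gauss's bracket with weights: [puv] = sum of p_r * u_r * v_r."""
    return sum(pi * ui * vi for pi, ui, vi in zip(p, u, v))

# Hypothetical observational equations  a x + b y - l = v  with weights p.
a = [1.0, 1.0, 1.0, 1.0]
b = [0.0, 1.0, 2.0, 3.0]
l = [1.1, 2.9, 5.2, 6.8]
p = [1.0, 2.0, 2.0, 1.0]

# Coefficients of the normal equations
#   [paa] x + [pab] y - [pal] = 0
#   [pab] x + [pbb] y - [pbl] = 0
paa, pab, pbb = bracket(p, a, a), bracket(p, a, b), bracket(p, b, b)
pal, pbl = bracket(p, a, l), bracket(p, b, l)

# Solution of the two normal equations by determinants.
det = paa * pbb - pab * pab
x = (pal * pbb - pab * pbl) / det
y = (paa * pbl - pab * pal) / det

# Residuals, and the checks [pav] = 0 and [pbv] = 0.
v = [ai * x + bi * y - li for ai, bi, li in zip(a, b, l)]
pav, pbv = bracket(p, a, v), bracket(p, b, v)
print(x, y, pav, pbv)
```

Multiplying each observational equation by the square root of its weight and then forming unweighted brackets would give the same coefficients, since the product of the two multiplied terms carries the factor p.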


36.     Independence  of  the  Normal  Equations. 

If the normal equations are not all independent, it will be
possible to derive any one of them by combining some or all of
the others, and the solution of equations (4) will not lead to a
determinate solution for x, y, z, etc. In this case it must be
possible to find m constants, the ratios of f, g, h, ..., k, so that

$$f\xi + g\eta + h\zeta + \ldots = k,$$

where ξ, η, ζ, etc. are written for the normal equations (4), and m
is the number of unknowns x, y, z, etc.

Collecting the coefficients of x, y, z, in the last equation,
we find

$$\begin{aligned}
f\,[aa] + g\,[ab] + \ldots &= 0\\
f\,[ab] + g\,[bb] + \ldots &= 0\\
&\ \ \vdots
\end{aligned} \qquad (7),$$

and finally,

$$f\,[al] + g\,[bl] + \ldots = -k.$$

Multiplying the first m of these equations by f, g, h, etc.
respectively, and adding the products, we find

$$\sum_r\left(f a_r + g b_r + \ldots\right)^2 = 0,$$

or

$$\begin{aligned}
f a_1 + g b_1 + \ldots &= 0\\
f a_2 + g b_2 + \ldots &= 0\\
&\ \ \vdots
\end{aligned} \qquad (8).$$

There will be one of these equations for each observational
equation. Multiplying equations (8) by $l_1$, $l_2$, etc. and adding,
we find

$$f\,[al] + g\,[bl] + \ldots = 0 = -k \text{ from (7)}.$$

It follows from equations (8) that there cannot be more than
(m − 1) independent observational equations; for equations (8)
will yield a relation between the coefficients of any m of these
equations. The problem of solving for x, y, z, etc. is then
indeterminate from the beginning.

The above discussion shows that if the normal equations are
not independent, there could not have been m independent
observational equations. We infer that if there are at least m
independent observational equations, there will result m independent
normal equations, the solution of which yields determinate values
of the m unknowns. This may be otherwise stated as follows:
If there are sufficient observational equations to determine or
over-determine the solution, the normal equations yield a
determinate solution.

The  condition  of  independence  is  in  general  satisfied  in  the 
problems  which  arise  in  practice.  We  can  then  proceed  to  the 
formation  and  solution  of  the  normal  equations. 
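The connexion between dependent observational equations and dependent normal equations can be seen in the simplest case of two unknowns. The sketch below, in Python, is an illustration and not part of the original text; the function name and the coefficients are hypothetical. It evaluates the determinant [aa][bb] − [ab]² of the normal equations, which vanishes when a relation f a_r + g b_r = 0 holds for every observational equation.

```python
def normal_determinant(a, b):
    """Determinant [aa][bb] - [ab]^2 of the two normal equations."""
    aa = sum(x * x for x in a)
    ab = sum(x * y for x, y in zip(a, b))
    bb = sum(y * y for y in b)
    return aa * bb - ab * ab

# Independent coefficients: the normal equations have a determinate solution.
print(normal_determinant([1, 1, 1], [0, 1, 2]))

# Here b_r = 2 a_r throughout, so that 2 a_r - b_r = 0 in every equation:
# the determinant vanishes and x, y cannot be separately determined.
print(normal_determinant([1, 2, 3], [2, 4, 6]))
```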

37.     Checks  on  the  Formation  of  the  Normal  Equations. 

Before the normal equations can be written in a complete
form, it is necessary to compute the products [aa], [bb], [ab], [al],
etc. It is necessary to check the formation of these products.
The most convenient form of check is the following.

Let $a_r + b_r + \ldots + l_r = s_r$.

Then multiplying each s-equation by the corresponding a, and
adding, we find

$$[aa] + [ab] + \ldots + [al] = [as].$$

And similarly

$$[ab] + [bb] + \ldots + [bl] = [bs],$$

etc.

Each of these equations is a check on the sum of the coefficients
of one of the normal equations. The calculation of [as], [bs], etc.
yields a double check on each of the cross-products. The
additional work involved is very slight. We evaluate $s_r$ for each
of the observational equations, and in the subsequent work treat
$s_r$ in the same way as any other coefficient.

The work of computing the products is carried out with the
aid of tables of squares, of products, or of logarithms, or with
an arithmometer, a machine for performing multiplication and
division. If the work is to be carried on by the aid of tables of
squares, the products [ab] etc. can be derived as follows:

$$(a + b)^2 = a^2 + b^2 + 2ab;$$

$$\therefore\ 2\,[ab] = [(a+b)^2] - [aa] - [bb].$$

Since [aa] and [bb] have to be evaluated in any case, the
evaluation of [ab] is simply replaced by the evaluation of [(a + b)²].

In any extensive piece of work involving a large number of
observational equations, or a large number of unknowns, it is a
great saving of labour to carry on the work in a fixed form, so
that the whole of the work, including the application of the
checks, shall be as uniform and mechanical as possible. It will
also be necessary to have a system of checks at different stages in
the process of solving the normal equations, so that any
arithmetical error shall be immediately detected.

A  great  deal  of  labour  can  be  saved  in  some  cases  by  selecting 
new  units  of  the  unknowns,  in  such  a  way  that  the  coefficients  in 
the  equations  shall  all  be  of  the  same  order  of  magnitude. 
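Both the sum check and the derivation of a cross-product from squares may be tried on a small set of coefficients. The sketch below is in Python and is not part of the original text; the helper name `br` is an assumption, and the coefficients are those of the four equations of Gauss treated in § 39.

```python
def br(u, v):
    """Gauss's bracket [uv] = sum of u_r * v_r."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Coefficients of four observational equations  a x + b y + c z - l = 0.
a = [1, 3, 4, -1]
b = [-1, 2, 1, 3]
c = [2, -5, 4, 3]
l = [3, 5, 21, 14]

# s_r = a_r + b_r + c_r + l_r, treated afterwards like any other coefficient.
s = [ar + b_r + cr + lr for ar, b_r, cr, lr in zip(a, b, c, l)]

# Sum checks: [aa] + [ab] + [ac] + [al] = [as], and similarly for b.
check_a = br(a, a) + br(a, b) + br(a, c) + br(a, l)
check_b = br(a, b) + br(b, b) + br(b, c) + br(b, l)
print(check_a, br(a, s))
print(check_b, br(b, s))

# Cross-product from a table of squares: 2[ab] = [(a+b)^2] - [aa] - [bb].
a_plus_b = [ar + b_r for ar, b_r in zip(a, b)]
ab_from_squares = (br(a_plus_b, a_plus_b) - br(a, a) - br(b, b)) // 2
print(ab_from_squares, br(a, b))
```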

38. It is sometimes possible to effect a considerable
simplification of the work of solution by means of a simple substitution;
e.g. when the observational equations are of the form

$$ax + bxy = l,$$

the solution is facilitated by the substitution

$$u = xy.$$

The observational equations then become

$$ax + bu = l,$$

and the solution can be carried out in the usual way. The p.e. of
an observational equation of unit weight can be calculated from
the residuals, and $r_u$ and $r_x$ deduced by the ordinary method. (See
next chapter.)

Then, since

$$y = \frac{u}{x}, \qquad dy = \frac{du}{x} - \frac{u\,dx}{x^2},$$

$$r_y^2 = \frac{r_u^2}{x^2} + \frac{u^2\,r_x^2}{x^4},$$

or

$$r_y^2 = \frac{1}{x^2}\left(r_u^2 + y^2\,r_x^2\right).$$

This method of introducing a new unknown dispenses with the
approximation which would otherwise be necessary in the first
step of the solution, by reducing the equations of condition to a
linear form.

39.     The  Formation  of  the  Normal  Equations. 

Example 1. Given the four equations*,

  x −  y + 2z −  3 = 0,
 3x + 2y − 5z −  5 = 0,
 4x +  y + 4z − 21 = 0,
 −x + 3y + 3z − 14 = 0,

it is required to find the most probable values of x, y, z.

If we solve the first three of these equations, we find

x = 18/7,   y = 23/7,   z = 13/7.

Substituting these values in the last equation, we find

−x + 3y + 3z − 14 = −8/7,

and so the fourth equation is not satisfied. We therefore proceed to form
the normal equations.

The formation of the normal equations is best carried out in tabular
form. The coefficients of the observational equations are first written down,
and the products formed in turn.

   a      b      c      l      s

   1     -1      2      3      5
   3      2     -5      5      5
   4      1      4     21     30
  -1      3      3     14     19


  aa     ab     ac     al     as

   1     -1      2      3      5
   9      6    -15     15     15
  16      4     16     84    120
   1     -3     -3    -14    -19

  27      6      0     88    121


  ab     bb     bc     bl     bs

          1     -2     -3     -5
          4    -10     10     10
          1      4     21     30
          9      9     42     57

   6     15      1     70     92


  ac     bc     cc     cl     cs

                 4      6     10
                25    -25    -25
                16     84    120
                 9     42     57

   0      1     54    107    162


  al     bl     cl     ll     ls

                        9     15
                       25     25
                      441    630
                      196    266

  88     70    107    671    936

Checks are applied to all the sums as they are formed.

* Gauss, Theoria Motus, § 184.


The normal equations may then be written

27x +  6y        =  88,
 6x + 15y +    z =  70,
        y + 54z = 107.

No special method is necessary for the solution of these equations.
Substituting for x and z in the second equation, we find

(6/27)(88 − 6y) + 15y + (1/54)(107 − y) = 70,

giving y = 3.55.

Substituting this value for y in the first and third equations, we find

x = 2.47,   y = 3.55,   z = 1.92.

The discussion of the p.e.'s of x, y, z will be found on page 107.
The quantity [ll] evaluated above is not essential to the formation of the
normal equations, but may be required later, in the discussion of p.e.'s.
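The formation and solution of these normal equations may be retraced by machine. The sketch below, in Python, is an illustration and not the method of the text; the function `solve_linear` is a hypothetical helper that performs ordinary elimination with back-substitution, in place of the substitution used above.

```python
def solve_linear(A, rhs):
    """Solve A t = rhs by elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]   # augmented matrix
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[pivot] = M[pivot], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (M[i][n] - tail) / M[i][i]
    return sol

# Coefficients of the four observational equations  a x + b y + c z - l = 0.
a = [1, 3, 4, -1]
b = [-1, 2, 1, 3]
c = [2, -5, 4, 3]
l = [3, 5, 21, 14]

cols = [a, b, c]
# Normal equations: N[i][j] = [c_i c_j], right-hand side = [c_i l].
N = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
rhs = [sum(u * v for u, v in zip(ci, l)) for ci in cols]

x, y, z = solve_linear(N, rhs)
print(N, rhs)
print(round(x, 3), round(y, 3), round(z, 3))
```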

Example 2. Given a set of 14 observational equations of the form

ax + by = l,

with weight p, the values of a, b, l, and p being given in the table below,
form the normal equations.

    a        b        l        s       p

  0.47     1.38    -0.68     1.17      5
  1.43     0.02     0.75     2.20      4
  0.15    -1.06     0.60    -0.31      3
 -0.98    -0.69     0.10    -1.57      3
 -0.82    -1.20     1.05    -0.97      4
 -0.88    -1.25    -1.00    -3.13      3
 -1.58    -0.20    -0.32    -2.10      5
 -1.12     0.62    -0.27    -0.77      3
 -1.27    -0.17     1.47     0.03      3
 -1.10     0.36    -0.06    -0.80      3
 -1.14     1.19    -0.36    -0.31      5
 -0.96     0.62    -0.36    -0.70      3
 -0.13     1.58     0.02     1.47      5
  0.62     1.00     0.50     2.12      3

We have to evaluate [paa], [pab], [pal], etc.

The first step is to rewrite the above table, with each line multiplied by
the corresponding weight.

   pa       pb       pl       ps

  2.35     6.90    -3.40     5.85
  5.72     0.08     3.00     8.80
  0.45    -3.18     1.80    -0.93
 -2.94    -2.07     0.30    -4.71
 -3.28    -4.80     4.20    -3.88
 -2.64    -3.75    -3.00    -9.39
 -7.90    -1.00    -1.60   -10.50
 -3.36     1.86    -0.81    -2.31
 -3.81    -0.51     4.41     0.09
 -3.30     1.08    -0.18    -2.40
 -5.70     5.95    -1.80    -1.55
 -2.88     1.86    -1.08    -2.10
 -0.65     7.90     0.10     7.35
  1.86     3.00     1.50     6.36
   paa       pab       pal       pas

  1.105     3.243    -1.598     2.750
  8.180     0.114     4.290    12.584
  0.068    -0.477     0.270    -0.139
  2.881     2.029    -0.294     4.616
  2.690     3.936    -3.444     3.182
  2.323     3.300     2.640     8.263
 12.482     1.580     2.528    16.590
  3.763    -2.083     0.907     2.587
  4.839     0.648    -5.601    -0.115
  3.630    -1.188     0.198     2.640
  6.498    -6.783     2.052     1.767
  2.765    -1.786     1.037     2.016
  0.084    -1.027    -0.013    -0.955
  1.153     1.860     0.930     3.943

 52.461     3.366     3.902    59.729

    pab      pbb      pbl      pbs

            9.522   -4.692    8.073
            0.002    0.060    0.176
            3.371   -1.908    0.986
            1.428   -0.207    3.250
            5.760   -5.040    4.656
            4.688    3.750   11.738
            0.200    0.320    2.100
            1.153   -0.502   -1.432
            0.087   -0.750   -0.016
            0.389   -0.065   -0.864
            7.081   -2.142   -1.844
            1.153   -0.670   -1.303
           12.482    0.158   11.613
            3.000    1.500    6.360

   3.366   50.316  -10.188   43.493   (sums)
    pal      pbl      pll      pls

                     2.312   -3.978
                     2.250    6.600
                     1.080   -0.558
                     0.030   -0.471
                     4.410   -4.074
                     3.000    9.390
                     0.512    3.360
                     0.219    0.624
                     6.483    0.133
                     0.011    0.144
                     0.648    0.558
                     0.389    0.756
                     0.002    0.147
                     0.750    3.180

   3.902  -10.188   22.096   15.811   (sums)

The normal equations are therefore

52.461x +  3.366y −  3.902 = 0,
 3.366x + 50.316y + 10.188 = 0,

and may easily be solved by ordinary algebraic methods.

The quantity [pll] evaluated here, and [ll] in the previous example, will
be required in evaluating the p.e.'s of the unknowns. It is convenient, though
not necessary, to evaluate [ll] or [pll] when forming the normal equations.
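
The formation of these sums can also be sketched numerically. The following is only an illustrative Python sketch, not the hand procedure of the text: it assumes NumPy, takes the columns a, b, l, p from the table of Example 2, and introduces a hypothetical helper `bracket` for the weighted sums [puv]. Since the text rounds each product to three decimals before adding, the computed sums should agree with the printed ones only to about the third decimal.

```python
import numpy as np

# Columns a, b, l, p of the 14 observational equations of Example 2.
rows = [
    ( 0.47,  1.38, -0.68, 5), ( 1.43,  0.02,  0.75, 4),
    ( 0.15, -1.06,  0.60, 3), (-0.98, -0.69,  0.10, 3),
    (-0.82, -1.20,  1.05, 4), (-0.88, -1.25, -1.00, 3),
    (-1.58, -0.20, -0.32, 5), (-1.12,  0.62, -0.27, 3),
    (-1.27, -0.17,  1.47, 3), (-1.10,  0.36, -0.06, 3),
    (-1.14,  1.19, -0.36, 5), (-0.96,  0.62, -0.36, 3),
    (-0.13,  1.58,  0.02, 5), ( 0.62,  1.00,  0.50, 3),
]
a, b, l, p = np.array(rows).T


def bracket(u, v):
    """Weighted sum [puv] = sum over r of p_r * u_r * v_r (hypothetical helper)."""
    return float(np.sum(p * u * v))


# Coefficients of the normal equations [paa]x + [pab]y - [pal] = 0, etc.
paa, pab, pbb = bracket(a, a), bracket(a, b), bracket(b, b)
pal, pbl, pll = bracket(a, l), bracket(b, l), bracket(l, l)

# Check sums formed with s = a + b + l, as in the s column of the table.
s = a + b + l
pas, pbs = bracket(a, s), bracket(b, s)

normal_matrix = np.array([[paa, pab], [pab, pbb]])
right_side = np.array([pal, pbl])
x, y = np.linalg.solve(normal_matrix, right_side)

print(f"[paa]={paa:.3f}  [pab]={pab:.3f}  [pbb]={pbb:.3f}")
print(f"[pal]={pal:.3f}  [pbl]={pbl:.3f}  [pll]={pll:.3f}")
print(f"x={x:.4f}  y={y:.4f}")
```

The relation [pas] = [paa] + [pab] + [pal] holds exactly here, which is the property the s column is meant to exploit.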

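Example 3. A quantity l, of the form

\[ x\cos\theta + y\sin\theta + z = l, \]

is observed for θ = 5°, 15°, 25°, etc., up to θ = 355°.
Form the normal equations.

With the previous notation, a = cos θ, b = sin θ, c = 1. Since
cos²θ = sin²(90° − θ),

\[
\begin{aligned}
& [aa] = \sum_{5^\circ}^{355^\circ}\cos^2\theta = 2\sum_{5^\circ}^{175^\circ}\cos^2\theta = 4\sum_{5^\circ}^{85^\circ}\cos^2\theta = 18,\\
& [bb] = \sum_{5^\circ}^{355^\circ}\sin^2\theta = 18,\\
& [cc] = 36,\\
& [ab] = \sum\cos\theta\sin\theta = 0,\\
& [ac] = \sum\cos\theta = 0,\\
& [bc] = \sum\sin\theta = 0.
\end{aligned}
\]

Hence the normal equations are

\[
\begin{aligned}
& 18x = [al] = \sum l\cos\theta,\\
& 18y = [bl] = \sum l\sin\theta,\\
& 36z = [cl] = \sum l.
\end{aligned}
\]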
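
The sums of Example 3 are easily confirmed numerically. The sketch below assumes NumPy; the function name `solve_example3` and the test values 1.5, −0.7, 0.2 are hypothetical and serve only to illustrate that the three normal equations separate.

```python
import numpy as np

# The 36 angles 5°, 15°, ..., 355° at which l is observed.
theta = np.deg2rad(np.arange(5, 360, 10))
a, b, c = np.cos(theta), np.sin(theta), np.ones_like(theta)

# Gaussian sums of the coefficients.
aa, bb, cc = a @ a, b @ b, c @ c
ab, ac, bc = a @ b, a @ c, b @ c
print(f"[aa]={aa:.6f} [bb]={bb:.6f} [cc]={cc:.0f}")
print(f"[ab]={ab:.1e} [ac]={ac:.1e} [bc]={bc:.1e}")


def solve_example3(l):
    """Solve 18x = [al], 18y = [bl], 36z = [cl] for given observations l."""
    return (a @ l) / 18.0, (b @ l) / 18.0, l.sum() / 36.0


# Error-free observations built from assumed values of x, y, z.
l = 1.5 * a - 0.7 * b + 0.2
x, y, z = solve_example3(l)
print(x, y, z)
```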
Example 4. The instants of passage of twelve successive swings of a
pendulum are observed. Find the period.

Let the observed times of the twelve successive passages be $t_0, t_1, t_2, \dots, t_{11}$.
Let the true time of the first passage be $a_0$ and let the period be $T$. Then we
have twelve observational equations,

\[
\begin{aligned}
& a_0 - t_0 = 0,\\
& a_0 + T - t_1 = 0,\\
& a_0 + 2T - t_2 = 0,\\
& \dots\\
& a_0 + 11T - t_{11} = 0.
\end{aligned}
\tag{1}
\]

The normal equations are

\[
\begin{aligned}
& 12a_0 + (1 + 2 + 3 + \dots + 11)\,T - (t_0 + t_1 + t_2 + \dots + t_{11}) = 0,\\
& (1 + 2 + 3 + \dots + 11)\,a_0 + (1^2 + 2^2 + 3^2 + \dots + 11^2)\,T - (t_1 + 2t_2 + \dots + 11t_{11}) = 0.
\end{aligned}
\]

If we substitute

\[ t_0 + t_1 + t_2 + \dots + t_{11} = s_1, \qquad t_1 + 2t_2 + \dots + 11t_{11} = s_2, \]

the normal equations become

\[
\begin{aligned}
& 12a_0 + 66T - s_1 = 0,\\
& 66a_0 + 506T - s_2 = 0.
\end{aligned}
\]

Solving by the ordinary algebraic method, we find

\[ T = \frac{2s_2 - 11s_1}{286}, \qquad a_0 = \frac{23s_1 - 3s_2}{78}. \tag{2} \]

The following method of dealing with the above problem is entirely wrong,
but affords an interesting example of the errors one is liable to commit.

Let the times be measured from the instant of the first passage, and let
the times of the other passages be $t_1, t_2, \dots, t_{11}$. Then the observational
equations are

\[
\begin{aligned}
& t_1 - T = 0,\\
& t_2 - 2T = 0,\\
& \dots\\
& t_{11} - 11T = 0.
\end{aligned}
\tag{1$'$}
\]

These equations are of equal weight, and the period is the value of T
which makes

\[ \sum (t_s - sT)^2 \quad \text{a minimum.} \]

Differentiating with respect to T, we find

\[ T = \frac{t_1 + 2t_2 + \dots + 11t_{11}}{1^2 + 2^2 + \dots + 11^2}. \tag{2$'$} \]

Or, if we return to the notation originally used, where the time of the first
passage is $t_0$,

\[ T = \frac{t_1 + 2t_2 + \dots + 11t_{11} - (1 + 2 + \dots + 11)\,t_0}{1^2 + 2^2 + \dots + 11^2}. \]

It should be noted that this result is entirely different from the one
originally obtained in equation (2). This method of treatment is fallacious,
because the error of noting the time of the first swing enters into each of the
equations (1)', $t_1, t_2, \dots, t_{11}$ being the differences of two observed times. The
method of least squares is only applicable to observational equations which
are independent, and it is therefore not permissible to apply it to equations (1)'.
The only correct method of dealing with this problem is to write down
equations (1), and solve them by the method followed above.
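
The contrast between the two treatments can be illustrated numerically. The sketch below assumes NumPy and uses invented passage times (a hypothetical first passage at 0.30, period 2.00, and small random timing errors); it compares the closed forms (2) with a general least-squares routine and also evaluates the fallacious estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
k = np.arange(12)                      # passage index 0, 1, ..., 11
a0_true, T_true = 0.30, 2.00           # hypothetical values, for illustration only
t = a0_true + k * T_true + rng.normal(scale=0.02, size=12)

# Sums s1 and s2 as defined in the text.
s1 = t.sum()
s2 = (k * t).sum()

# Equation (2): solution of the two normal equations.
T_hat = (2 * s2 - 11 * s1) / 286
a0_hat = (23 * s1 - 3 * s2) / 78

# The same problem handed to a general least-squares solver.
design = np.column_stack([np.ones(12), k])
solution, *_ = np.linalg.lstsq(design, t, rcond=None)
a0_ls, T_ls = solution

# The fallacious estimate, which forces the line through the first observation.
T_fallacious = (s2 - k.sum() * t[0]) / (k ** 2).sum()

print(f"T from (2):        {T_hat:.5f}")
print(f"T from lstsq:      {T_ls:.5f}")
print(f"T, fallacious way: {T_fallacious:.5f}")
```

The first two values coincide; the third differs from them whenever the first timing carries an error.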

Example 5. Form the observational equations for the rate and error of
a sidereal clock, from the following set of observations. Derive the normal
equations, and solve them.

  Star        Observed transit     R.A. (seconds)

  η Tauri     3h 38m  7s.64          8s.08
  A Tauri     3  55  23 .16         23 .62
  ι Tauri     4  53  41 .19         41 .65
  β Tauri     5  16  20 .56         21 .05

(The error of a clock at time t is a + bt, where a and b are constants.)

Example  6.     From  the  equations 

X  =     2, 

x+y  =     3, 

x − y + z     = −1,
x-Zy  +  ^z=  -2, 
form the normal equations, and solve for x, y, z.

40.     Solution  of  the  Normal  Equations. 

In  examples  2  and  5  above,  there  are  only  two  unknowns  in 
each  case,  so  that  there  are  only  two  normal  equations ;  while  in 
examples  1  and  6  there  are  three  unknowns,  but  the  coefficients 
of  these  unknowns  in  the  normal  equations  are  integral.  In  all 
four  cases,  therefore,  the  equations  can  be  solved  by  the  ordinary 
methods of elementary algebra. When the number of unknowns
is  greater  than  2,  and  the  coefficients  in  the  normal  equations  are 
not  integral,  the  solution  has  to  be  carried  out  by  one  of  the 
special  methods  discussed  in  the  next  chapter. 

One  of  the  difficulties  that  arise  when  we  attempt  to  solve  the 
equations in the haphazard way of ordinary algebra is that of
determining  the  number  of  places  of  decimals  to  retain  at  each 
step.  It  is  one  of  the  merits  of  Gauss's  method  (to  be  discussed 
in  the  next  chapter),  that  it  prevents  our  getting  into  difficulties 
over  this. 


CHAPTER  VI 

EVALUATION    OF    THE    MOST    PROBABLE    VALUES    OF    THE 
UNKNOWNS,   THEIR  WEIGHTS    AND    PROBABLE    ERRORS 

41.     Gauss's  Method  of  Substitution. 

For the sake of simplicity in writing, we shall suppose that
there are three unknowns, x, y, z, but the method can be
automatically extended to include any number of unknowns.

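Let the normal equations be

\[
\begin{aligned}
& [aa]x + [ab]y + [ac]z - [al] = 0,\\
& [ab]x + [bb]y + [bc]z - [bl] = 0,\\
& [ac]x + [bc]y + [cc]z - [cl] = 0.
\end{aligned}
\tag{i}
\]

From the first equation we find

\[ x = -\frac{[ab]}{[aa]}\,y - \frac{[ac]}{[aa]}\,z + \frac{[al]}{[aa]}. \tag{ii} \]

Substituting this value in the second and third equations, we
obtain the following equations:

\[
\begin{aligned}
& [bb1]y + [bc1]z - [bl1] = 0,\\
& [bc1]y + [cc1]z - [cl1] = 0,
\end{aligned}
\tag{iii}
\]

where

\[ [bb1] = [bb] - \frac{[ab][ab]}{[aa]}, \qquad [bc1] = [bc] - \frac{[ab][ac]}{[aa]}, \qquad \text{etc.} \tag{iv} \]

From the first equation in (iii),

\[ y = -\frac{[bc1]}{[bb1]}\,z + \frac{[bl1]}{[bb1]}. \tag{v} \]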
Substituting this value in the second equation (iii), we find

\[ [cc2]z - [cl2] = 0, \tag{vi} \]

where

\[ [cc2] = [cc1] - \frac{[bc1][bc1]}{[bb1]}, \qquad [cl2] = [cl1] - \frac{[bc1][bl1]}{[bb1]}. \tag{vii} \]

Equation (vi) gives the value of z. Substituting this value
for z in equation (v) we evaluate y, and then by substitution
for y and z in equation (ii) we obtain the value of x. The
equations from which the values of the unknowns are deduced
are here collected for convenience of reference.

\[
\begin{aligned}
& z = \frac{[cl2]}{[cc2]},\\
& y = -\frac{[bc1]}{[bb1]}\,z + \frac{[bl1]}{[bb1]},\\
& x = -\frac{[ab]}{[aa]}\,y - \frac{[ac]}{[aa]}\,z + \frac{[al]}{[aa]}.
\end{aligned}
\tag{viii}
\]

The  solution  for  four  unknowns  is  similar  to  the  above,  the 
work  being  carried  through  a  further  stage,  involving  the 
evaluation of [dd3], etc. In practice, the order of procedure
is  slightly  varied  from  that  shown  above,  but  the  method  can 
be  best  understood  by  the  actual  solution  of  a  set  of  normal 
equations. 
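
The substitution scheme of this section can be sketched in code for three unknowns. The sketch below is only an illustration in Python with NumPy; the function name `gauss_substitution_3` and the numerical system are hypothetical (the system is chosen so that x = y = z = 1), and the variable names follow the bracket notation of the text.

```python
import numpy as np


def gauss_substitution_3(N, r):
    """Solve a symmetric 3x3 system N @ (x, y, z) = r by equations (ii) to (viii)."""
    aa, ab, ac = N[0]
    bb, bc = N[1][1], N[1][2]
    cc = N[2][2]
    al, bl, cl = r

    # First stage: eliminate x (auxiliaries with affix 1).
    bb1 = bb - ab * ab / aa
    bc1 = bc - ab * ac / aa
    cc1 = cc - ac * ac / aa
    bl1 = bl - ab * al / aa
    cl1 = cl - ac * al / aa

    # Second stage: eliminate y (auxiliaries with affix 2).
    cc2 = cc1 - bc1 * bc1 / bb1
    cl2 = cl1 - bc1 * bl1 / bb1

    # Back-substitution, equations (viii).
    z = cl2 / cc2
    y = -bc1 / bb1 * z + bl1 / bb1
    x = -ab / aa * y - ac / aa * z + al / aa
    return (x, y, z), (aa, bb1, cc2)


# Hypothetical symmetric system whose solution is x = y = z = 1.
N = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])
r = np.array([7.0, 4.0, 7.0])

unknowns, leading = gauss_substitution_3(N, r)
print("x, y, z       =", unknowns)
print("[aa],[bb1],[cc2] =", leading)
```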

42.     Checks  in  Computation. 

As  the  solution  of  a  system  of  normal  equations  involves  a 
considerable  amount  of  arithmetic,  it  is  advisable  to  check  the 
correctness  of  the  work  at  each  stage. 

α. In the first place, it can be shown that the leading
coefficients [aa], [bb1], [cc2], etc., are all positive. Since [aa] is
the sum of a number of squares, it must be positive. Again

\[ [aa][bb1] = [aa][bb] - [ab][ab] = \sum (a_r b_s - a_s b_r)^2 = \text{a positive quantity}^{*} = [KK], \text{ say.} \]

Hence [bb1] is positive.

* It follows that [aa][bb] is greater than $[ab]^2$.
Similarly $[aa][cc1] = \sum (a_r c_s - a_s c_r)^2 = [LL]$, say.
With this notation,

\[ [aa][bc1] = [aa][bc] - [ac][ab] = \sum \{(a_r b_s - a_s b_r)(a_r c_s - a_s c_r)\} = [KL]. \]

It follows from equation (vii) that

\[ [aa]^2\,[bb1][cc2] = [KK][LL] - [KL][KL] = \sum (K_r L_s - K_s L_r)^2 = \text{a positive quantity.} \]

Hence it follows that [cc2] is a positive quantity.

Since

[aa][bb1] = [KK],   [aa][bc1] = [KL],   etc.,

it follows that the system of equations obtained after eliminating
x is similar to the original system of normal equations,
in that all the coefficients in the leading diagonal are positive,
while the other coefficients are symmetrical about this diagonal.
The leading coefficient which results from eliminating two
unknowns from this set of equations, will be of the same nature
as [cc2]; i.e. it will be positive. This coefficient is [dd3]. This
line of argument can be applied to any number of unknowns.

The calculation of [bb1], [cc2], etc., may in some measure be
checked by the result deduced above, a negative value of such
a coefficient indicating an error in computation.

β. A useful check on the calculations made in solving the
normal equations may be obtained by a simple extension of the
checks on the formation of the normal equations (see § 37). The
formation of the equations is checked by means of quantities
[as], [bs], ..., [ls]. If we operate upon these quantities in the
same way as upon [al], [bl], etc., we can form new quantities
[bs1], [cs1], [cs2], etc. When there are only three unknowns, it
is easily shown that

\[ [bb1] + [bc1] + [bl1] = [bs1], \qquad [bc1] + [cc1] + [cl1] = [cs1]. \]

After the next stage of elimination we have

\[ [cc2] + [cl2] = [cs2]. \]
Similar relations will hold for any number of unknowns.
A  check  of  this  kind  is  in  general  sufficient  to  detect  any  errors 
of  computation.  The  only  additional  work  involved  in  the 
carrying  on  of  the  check  is  the  addition  of  a  column  to  the 
tabular  form  in  which  the  work  is  carried  out. 

γ. An additional check may be got by reversing the order
of  elimination  of  the  unknowns,  but  this  involves  considerable 
labour. 

δ. When the unknowns have been evaluated, the substitution
of their values in the observational equations yields the values of
the residuals $v_1, v_2$, etc. A number of useful checks on the work
can be derived from these residuals. It has already been shown
that

[av] = [bv] = [cv] = ... = 0.

Any  of  these  equations  may  be  used  as  a  check  on  the  final  results 
of  solving  the  normal  equations. 

ε. If the observational equations, written in the form

\[ a_r x + b_r y + c_r z - l_r = v_r \qquad (r = 1, 2, 3, \text{etc.}) \]

be multiplied by $v_1, v_2, v_3$, etc., and added, we obtain the equation

\[ -[vl] = [vv], \]

since the coefficients of x, y, z, etc., vanish.

If the observational equations be multiplied by $l_1, l_2$, etc., and
added, we obtain the equation

\[ [al]x + [bl]y + [cl]z - [ll] = [lv] = -[vv]. \tag{ix} \]

Now take equations (viii) and substitute in the last equation (ix)
for x, y, z, in turn. Substituting for x, we obtain

\[ -[bl1]y - [cl1]z + [ll1] = [vv]. \]

Substituting for y, from the second of equations (viii),

\[ -[cl2]z + [ll2] = [vv]. \]

Finally substituting for z, from the first of equations (viii),

\[ [ll3] = [vv]. \]

Hence

\[ [vv] = [ll3] = [ll2] - \frac{[cl2]^2}{[cc2]}, \]

or

\[ [vv] = [ll] - \frac{[al]^2}{[aa]} - \frac{[bl1]^2}{[bb1]} - \frac{[cl2]^2}{[cc2]}. \]
Any of the equations given above might be used as a check on
the final result. Perhaps the most useful check is to compare the
value of [vv] obtained by squaring the calculated residuals, with
the value of [vv] obtained from the last equation. In any case the
value of [vv] will be required when we come to the problem of
evaluating the probable errors of the adjusted values of the
unknowns.

It should be noted that the last form for [vv] is capable of
immediate generalisation for any number of unknowns, say m.
We then have

\[ [vv] = [llm] = [ll] - \frac{[al]^2}{[aa]} - \frac{[bl1]^2}{[bb1]} - \text{etc.}, \]

where there are m terms after [ll] on the right-hand side of the
equation.
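
The residual checks δ and ε can be tried on a small numerical case. The sketch below assumes NumPy and uses randomly generated coefficients and observations, which are hypothetical and serve only to exercise the identities; it compares [vv] formed from the residuals with the value carried through the elimination.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3
A = rng.normal(size=(n, m))   # hypothetical coefficients a_r, b_r, c_r
l = rng.normal(size=n)        # hypothetical observed quantities l_r

N = A.T @ A                   # [aa], [ab], ... of the normal equations
r = A.T @ l                   # [al], [bl], [cl]
ll = float(l @ l)             # [ll]

unknowns = np.linalg.solve(N, r)
v = A @ unknowns - l          # residuals v_r = a_r x + b_r y + c_r z - l_r
vv_direct = float(v @ v)

# Forward elimination: at stage k the pivot is [aa], [bb1], [cc2] in turn
# and q[k] is [al], [bl1], [cl2]; each stage removes one term from [ll].
M, q = N.copy(), r.copy()
vv_elimination = ll
for k in range(m):
    vv_elimination -= q[k] ** 2 / M[k, k]
    for i in range(k + 1, m):
        factor = M[i, k] / M[k, k]
        M[i, k:] -= factor * M[k, k:]
        q[i] -= factor * q[k]

print("[av], [bv], [cv] =", A.T @ v)          # check delta: all vanish
print("[vv] from residuals  :", vv_direct)
print("[vv] from elimination:", vv_elimination)
print("-[vl]                :", -float(v @ l))
```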

43.     Form  of  Solution. 

In  any  piece  of  work  it  is  a  decided  advantage  to  follow  a 
systematic method of solution, in which the evaluation of
coefficients and the application of checks is to a large extent
mechanical.  Table  A  shows  a  convenient  form  of  solution  for 
four  unknowns. 

When  there  are  only  three  unknowns,  there  will  be  no  terms 
involving  d,  and  only  the  first  eleven  lines  in  the  table  need  be 
evaluated.  When  an  arithmometer,  or  a  table  of  products,  is 
used,  the  solution  can  be  worked  out  line  by  line  as  shown  in  the 
table.  The  form  can  be  varied  to  suit  individual  tastes.  The 
last  column  indicates  how  the  various  lines  in  the  table  are 
derived  from  the  preceding  lines.  (Ch.)  indicates  the  lines  in 
which  the  checks  are  to  be  applied.  When  the  work  is  to  be  done 
by  the  use  of  logarithmic  tables,  the  order  of  computation  need 
not  be  altered,  but  it  is  necessary  to  put  in  some  additional  lines 
containing logarithms. We shall solve by the use of logarithmic
tables, following the order of Table A as closely as possible, the
following set of normal equations:

                                                          Check

153.000x + 6.285y +  2.485z − 27.831w =  22.093         156.032
         + 8.989y +  4.037z −  0.426w =  −3.855          15.030
                  + 23.616z −  3.504w =  −9.952          16.682
                            +  9.080w =   7.251         −15.430
The solution of this set of equations is given in Table B. In
this  table  the  order  of  solution  is  that  indicated  in  Table  A.  The 
last  column  indicates  a  comparison  between  this  table  and  Table  A, 
and  the  additional  lines  necessitated  by  the  use  of  logarithmic 
tables  are  indicated  by  accented  numbers  in  the  first  column. 

A few points of procedure deserve notice. Line 1′ contains
the logarithms of the quantities in line 1. The logarithm of a
negative quantity is expressed by writing down the logarithm of
the quantity concerned, with n prefixed, to indicate that we are
dealing with a negative quantity. Thus

log −27.831 = n 1.4445288,

1.4445288 being log 27.831. Two n's added or subtracted
annihilate one another. Line 2 is derived from line 1′ by
subtracting the first term in line 1′ from all the others. This can be
conveniently done by writing the first term at the bottom of
a piece of paper, and carrying the piece of paper along line 1′,
so that this term may be subtracted in turn from each of the
other terms. To derive line 4 from line 2, we first add 0.7983053
to each term in line 2, and find from the tables the numbers
corresponding to the sums. These numbers are written down in
line 4. The remainder of the work follows the same method.
The final solution is

x =  0.727,
y = −0.836,
z =  0.093,
w =  3.026.
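
The same set of four normal equations can be worked through with floating-point arithmetic in place of logarithmic tables. The sketch below is an illustration only, assuming NumPy; it does not reproduce the line-by-line layout of Tables A and B, but it carries the Check column through every stage of the elimination, in the manner of check β of § 42. Small differences from the printed values in the third decimal are to be expected.

```python
import numpy as np

# Upper triangle of the normal equations, as printed; the rest follows by symmetry.
upper = np.array([
    [153.000, 6.285,  2.485, -27.831],
    [  0.0,   8.989,  4.037,  -0.426],
    [  0.0,   0.0,   23.616,  -3.504],
    [  0.0,   0.0,    0.0,     9.080],
])
N = upper + upper.T - np.diag(np.diag(upper))
right_side = np.array([22.093, -3.855, -9.952, 7.251])
check = N.sum(axis=1) + right_side        # the printed Check column

# Augmented array: coefficients | right-hand side | check sum.
M = np.column_stack([N, right_side, check])
m = len(N)
sum_check_ok = []
for k in range(m - 1):
    for i in range(k + 1, m):
        M[i, k:] -= M[i, k] / M[k, k] * M[k, k:]
    # After each stage every remaining row must still add up to its check entry.
    rest = M[k + 1:]
    sum_check_ok.append(
        bool(np.allclose(rest[:, :m].sum(axis=1) + rest[:, m], rest[:, m + 1]))
    )

pivots = np.diag(M[:, :m]).copy()         # [aa], [bb1], [cc2], [dd3]

# Back-substitution for w, z, y, x in turn.
unknowns = np.zeros(m)
for k in reversed(range(m)):
    unknowns[k] = (M[k, m] - M[k, k + 1:m] @ unknowns[k + 1:]) / M[k, k]

print("check column:", check)
print("sum checks  :", sum_check_ok)
print("pivots      :", np.round(pivots, 3))
print("x, y, z, w  :", np.round(unknowns, 3))
```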

44.     The    Doolittle   Method    of  Solution. 

A  variant  of  Gauss's  method  of  substitution,  due  to  Mr  Doolittle, 
of  the  U.S.  Coast  and  Geodetic  Survey*,  yields  a  considerable  gain 
in  speed,  and  saves  much  labour  by  reducing  the  number  of  entries 
from  the  tables  to  a  minimum.  Take  the  case  of  three  normal 
equations, 

\[
\begin{aligned}
& [aa]x + [ab]y + [ac]z = [al],\\
& [ab]x + [bb]y + [bc]z = [bl],\\
& [ac]x + [bc]y + [cc]z = [cl].
\end{aligned}
\]

* Coast and Geodetic Survey, Report for 1878, Appendix 8, pp. 115-118.
The notation used in the solution is that of Gauss's method. The
order of solution is shown in tables C and D.

Table C.

Line   Reciprocal    x       y              z                l

 1     −1/[aa]       [aa]    [ab]           [ac]             −[al]
 2                   x =     −[ab]/[aa]     −[ac]/[aa]       +[al]/[aa]
 3     −1/[bb1]              [bb1]          [bc1]            −[bl1]
 4                           y =            −[bc1]/[bb1]     +[bl1]/[bb1]
 5     −1/[cc2]                             [cc2]            −[cl2]
 6                                          z =              +[cl2]/[cc2]

Table D.

Line   y                   z                      l

 1     [bb]                [bc]                   −[bl]
 2     −[ab][ab]/[aa]      −[ab][ac]/[aa]         +[ab][al]/[aa]
 3                         [cc]                   −[cl]
 4                         −[ac][ac]/[aa]         +[ac][al]/[aa]
 5                         −[bc1][bc1]/[bb1]      +[bc1][bl1]/[bb1]

The coefficients of the first normal equation are written in line 1,
table C. The reciprocal of [aa], with its sign changed, is written
in the first column, and all the other quantities in line 1 are
multiplied by it, the results being entered in line 2. Line 2 then
gives x as an explicit function of y and z. The coefficients of the
second normal equation are written in line 1, table D. Line 2,
table C, is now multiplied by [ab], and the results entered in
table D, line 2. The sum of lines 1 and 2, table D, is entered in
table C, line 3. The reciprocal of [bb1], with its sign changed, is
written in the first column of table C, and all the other quantities
in line 3 are multiplied by this reciprocal, the results being entered
in line 4. This line gives y as an explicit function of z. Next, the
coefficients of the third normal equation are entered in line 3 of
table D. The terms in lines 2 and 4 of table C, beginning only at
the z terms, are multiplied by [ac] and [bc1] respectively, and the
results entered in table D, lines 4 and 5. The sum of lines 3, 4 and 5
of table D is entered in table C, line 5. Line 6, table C, is obtained
by first writing down the reciprocal of [cc2], with its sign changed,
and multiplying the other term of line 5 by this reciprocal. Line 6
gives the value of z. The values of y and x are obtained by
successive substitution in lines 4 and 2 of table C.

This  method  of  solution  can  be  easily  extended  for  any  number 
of  unknowns.  Its  advantage  over  the  older  method,  due  to  the 
reduction of entries, is even greater for larger numbers of
unknowns than for three unknowns as discussed above. In practice
it  is  convenient  to  make  tables  C  and  D  on  separate  sheets  of  paper, 
and  to  fold  table  C  along  alternate  lines,  to  facilitate  the  carrying 
of  the  numbers  of  table  C  to  where  they  are  required  in  table  D. 

45.     Solution  by  the  Method  of  Determinants*. 

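The solution of the normal equations

\[
\begin{aligned}
& [aa]x + [ab]y + [ac]z = [al],\\
& [ab]x + [bb]y + [bc]z = [bl],\\
& [ac]x + [bc]y + [cc]z = [cl],
\end{aligned}
\]

may be immediately written down in determinantal form

\[
x = \frac{1}{D}
\begin{vmatrix}
{[al]} & [ab] & [ac]\\
{[bl]} & [bb] & [bc]\\
{[cl]} & [bc] & [cc]
\end{vmatrix},
\qquad
y = \frac{1}{D}
\begin{vmatrix}
{[aa]} & [al] & [ac]\\
{[ab]} & [bl] & [bc]\\
{[ac]} & [cl] & [cc]
\end{vmatrix},
\qquad \text{etc.,}
\]

where

\[
D =
\begin{vmatrix}
{[aa]} & [ab] & [ac]\\
{[ab]} & [bb] & [bc]\\
{[ac]} & [bc] & [cc]
\end{vmatrix}.
\]

* This method is not a very practical one in general. It is given here because
it is useful in the development of the theory.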
These results may be extended to any number of unknowns.
The determinant D, which is symmetrical about the leading
diagonal, contains as many rows and columns as there are
unknowns.

The above solution leads to a form for calculating [vv] which
is occasionally useful. It has already been shown (page 93) that

\[ [vv] = [ll] - [al]x - [bl]y - [cl]z. \]

Hence

\[
\begin{aligned}
D\,[vv] ={}& [ll]
\begin{vmatrix}
{[aa]} & [ab] & [ac]\\
{[ab]} & [bb] & [bc]\\
{[ac]} & [bc] & [cc]
\end{vmatrix}
- [al]
\begin{vmatrix}
{[ab]} & [ac] & [al]\\
{[bb]} & [bc] & [bl]\\
{[bc]} & [cc] & [cl]
\end{vmatrix}\\
&+ [bl]
\begin{vmatrix}
{[aa]} & [ac] & [al]\\
{[ab]} & [bc] & [bl]\\
{[ac]} & [cc] & [cl]
\end{vmatrix}
- [cl]
\begin{vmatrix}
{[aa]} & [ab] & [al]\\
{[ab]} & [bb] & [bl]\\
{[ac]} & [bc] & [cl]
\end{vmatrix},
\end{aligned}
\]

or

\[
D\,[vv] =
\begin{vmatrix}
{[aa]} & [ab] & [ac] & [al]\\
{[ab]} & [bb] & [bc] & [bl]\\
{[ac]} & [bc] & [cc] & [cl]\\
{[al]} & [bl] & [cl] & [ll]
\end{vmatrix}.
\]

This is a determinant formed from D by the addition of a fresh
row and a fresh column containing the l terms.
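
The determinantal forms can be verified on a small numerical case. The sketch below assumes NumPy and uses randomly generated coefficients and observations, which are hypothetical; it evaluates the unknowns as ratios of determinants and [vv] as the ratio of the bordered determinant to D. It is meant as a check on the theory, not as a recommended way of computing.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(7, 3))      # hypothetical coefficients a_r, b_r, c_r
l = rng.normal(size=7)           # hypothetical observed quantities l_r

N = A.T @ A                      # matrix of [aa], [ab], ...
r = A.T @ l                      # [al], [bl], [cl]
ll = float(l @ l)                # [ll]
D = np.linalg.det(N)

# Each unknown is the determinant with its column replaced by the l terms, over D.
unknowns = []
for j in range(3):
    Nj = N.copy()
    Nj[:, j] = r
    unknowns.append(np.linalg.det(Nj) / D)
unknowns = np.array(unknowns)

# D bordered by a fresh row and column containing [al], [bl], [cl], [ll].
bordered = np.block([[N, r[:, None]],
                     [r[None, :], np.array([[ll]])]])
vv_from_determinants = np.linalg.det(bordered) / D

v = A @ unknowns - l
vv_direct = float(v @ v)
print("x, y, z:", unknowns)
print("[vv] from determinants:", vv_from_determinants)
print("[vv] from residuals   :", vv_direct)
```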

46.  Probable  Error  of  an  Observational  Equation  of 
Unit  Weight,  when  there  are  n  Observations  involving 
m  Independent  Unknowns. 

If the observations be not all of the same weight, each
observational equation is first multiplied by the square root of its
weight, so as to reduce all the residuals to unit weight. Let the
system of residuals, thus reduced to unit weight, be $v_1, v_2$, etc.
If the error law followed by the residuals be

\[ \frac{h}{\sqrt{\pi}}\,e^{-h^2\Delta^2}, \]

then the a priori probability of the given system of residuals,
$v_1, v_2$, etc., is

\[ C h^n e^{-h^2 [vv]}, \]

where C is a numerical constant (cf. § 14). This probability may
be written

\[ C h^n e^{-h^2 \sum (a_r x + b_r y + c_r z + \dots - l_r)^2}. \]
But x, y, etc., are all unknown and independent, and may, as far
as we know, take any values between −∞ and +∞. The
complete probability is therefore got by integrating the last expression
between limits ±∞ for each of the unknowns. The probability
then becomes

\[ C \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty}\!\!\cdots\!\int_{-\infty}^{+\infty} h^n e^{-h^2 \sum (a_r x + b_r y + \dots - l_r)^2}\,dx\,dy\,dz\,\dots \qquad (m \text{ integrations}). \]

In order to integrate with respect to x, we write the coefficient of
$-h^2$ in the exponential term, in the form

\[ P + (Qx + R)^2, \]

where Q is a numerical factor, and P and R are functions of
y, z, etc. Then

\[ \int_{-\infty}^{+\infty} e^{-h^2 (Qx + R)^2}\,dx = \frac{\sqrt{\pi}}{hQ}. \]

The factor 1/Q is a numerical factor which depends only on the
values of the coefficients in the observational equations. We may
regard it as being absorbed in the constant C.

The expression for the probability, with x eliminated, becomes

\[ C \int\!\!\int\!\cdots\!\int h^{n-1} e^{-h^2 P}\,dy\,dz\,\dots \qquad ((m - 1) \text{ integrations}). \]

Here P is the quadratic function of y, z, etc., which results from

\[ \sum (a_r x + b_r y + \dots - l_r)^2, \]

when x is so chosen as to make this last expression a minimum.
It follows that after m integrations, the probability becomes

\[ C h^{n-m} e^{-h^2 T}, \]

where C is a constant, and T is the value of $\sum (a_r x + b_r y + \dots - l_r)^2$
which results from giving x, y, z, etc., the values which make this
sum a minimum. But these are the values of x, y, z, etc., which
are determined by the ordinary least square solution, using the
normal equations. It follows that

\[ T = [vv]. \]

The total probability of the given system of residuals is thus

\[ C h^{n-m} e^{-h^2 [vv]}. \]

The value of the parameter h which determines the scale of the
error curve has to be selected so as to make the probability of the
given system of residuals a maximum. Taking logarithms, and
differentiating with respect to h, we find

\[ \frac{n - m}{h} - 2h\,[vv] = 0, \qquad \text{or} \qquad h = \sqrt{\frac{n - m}{2\,[vv]}}. \]

The M.S.E. μ of an observation of unit weight is thus

\[ \mu = \sqrt{\frac{[vv]}{n - m}}, \]

and the P.E. r of an observation of unit weight, or an observational
equation of unit weight, is given by*

\[ r = 0.6745 \sqrt{\frac{[vv]}{n - m}}. \]

The quantity [vv] may be evaluated in a number of ways.
The obvious method is to calculate it from the actual residuals,
and since it is generally advisable to test the character of the
work by evaluating the residuals and considering their magnitude,
this is perhaps the most satisfactory method. But it is also
possible to calculate [vv] from any of the relations shown below.

\[
\begin{aligned}
& [vv] = -[vl],\\
& [vv] = [ll] - [al]x - [bl]y - [cl]z - \text{etc.},\\
& [vv] = [ll] - \frac{[al]^2}{[aa]} - \frac{[bl1]^2}{[bb1]} - \frac{[cl2]^2}{[cc2]} - \text{etc.}
\end{aligned}
\]

Or any one of these relations may be used as a check upon the
value obtained by squaring the residuals.

* Or $r = 0.6745 \sqrt{\dfrac{[pvv]}{n - m}}$, where v is the residual in an observational equation
of weight p.
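
The formula for r translates directly into a short routine. The sketch below is written in plain Python; the function name `probable_error_unit_weight` and the sample residuals are hypothetical and are given only for illustration. The optional weights correspond to the form of the footnote.

```python
import math


def probable_error_unit_weight(residuals, m, weights=None):
    """Return r = 0.6745 * sqrt([pvv] / (n - m)) for n residuals and m unknowns."""
    n = len(residuals)
    if n <= m:
        raise ValueError("need more observations than unknowns")
    if weights is None:
        weights = [1.0] * n          # equal weights: [pvv] reduces to [vv]
    pvv = sum(p * v * v for p, v in zip(weights, residuals))
    mean_square_error = math.sqrt(pvv / (n - m))
    return 0.6745 * mean_square_error


# Hypothetical residuals from five observational equations in two unknowns.
sample = [0.3, -0.1, 0.2, -0.4, 0.1]
r_equal = probable_error_unit_weight(sample, m=2)
r_weighted = probable_error_unit_weight(sample, m=2, weights=[4.0] * 5)
print(r_equal, r_weighted)
```

With every weight equal to 4 the result is exactly double that for unit weights, as the square root in the formula requires.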
47.     Evaluation  of  Probable  Errors  of  the  Unknowns.

Since the normal equations are linear in x, y, z, etc., and also
in $l_1, l_2$, etc., we may write

\[
\begin{aligned}
& x = \alpha_1 l_1 + \alpha_2 l_2 + \text{etc.},\\
& y = \beta_1 l_1 + \beta_2 l_2 + \text{etc.},
\end{aligned}
\]

where $\alpha_1, \alpha_2, \dots, \beta_1, \beta_2, \dots$, etc., are functions of the coefficients
a, b, c.

The P.E. of an observational equation of unit weight, or the
P.E. of each of the l's, is the quantity r which has already been
deduced above. If $r_x, r_y$, etc., be the P.E.'s of the unknowns, it
follows that

\[ r_x^2 = [\alpha\alpha]\,r^2, \qquad r_y^2 = [\beta\beta]\,r^2, \qquad \text{etc.} \]

It is therefore necessary to evaluate [αα], [ββ], etc. The
reciprocals of these sums may be regarded as the weights of x, y, etc.
Calling these $p_x, p_y$, etc., we have the relations

\[ p_x = \frac{1}{[\alpha\alpha]}, \qquad p_x r_x^2 = r^2. \]

Returning to the solution of the normal equations by
determinants, we may write

\[
x = \frac{1}{D}
\begin{vmatrix}
{[al]} & [ab] & [ac]\\
{[bl]} & [bb] & [bc]\\
{[cl]} & [bc] & [cc]
\end{vmatrix}
= \alpha_1 l_1 + \alpha_2 l_2 + \dots .
\]

Collecting the terms in $l_1$ in the determinant, we have the
relation

\[
D\alpha_1 =
\begin{vmatrix}
a_1 & [ab] & [ac]\\
b_1 & [bb] & [bc]\\
c_1 & [bc] & [cc]
\end{vmatrix}
= a_1 A + b_1 B + c_1 C,
\]

where A, B, C are the minors of $a_1, b_1, c_1$, in the determinant.
Squaring this expression, we may write the result as

\[
\begin{aligned}
D^2\alpha_1^2 ={}& A\,(a_1 a_1 A + a_1 b_1 B + a_1 c_1 C)\\
&+ B\,(a_1 b_1 A + b_1 b_1 B + b_1 c_1 C)\\
&+ C\,(a_1 c_1 A + b_1 c_1 B + c_1 c_1 C).
\end{aligned}
\]
Summing for all possible suffixes we find

\[
\begin{aligned}
D^2[\alpha\alpha] ={}& A\,\{[aa]A + [ab]B + [ac]C\}\\
&+ B\,\{[ab]A + [bb]B + [bc]C\}\\
&+ C\,\{[ac]A + [bc]B + [cc]C\}\\
={}& A\,\{[aa]A + [ab]B + [ac]C\} = AD.
\end{aligned}
\]

For if the second and third brackets be written in the form of
determinants, two columns of each determinant would be the
same, and so their values would be zero.

Hence

\[
[\alpha\alpha] = \frac{A}{D} = \frac{1}{D}
\begin{vmatrix}
1 & [ab] & [ac]\\
0 & [bb] & [bc]\\
0 & [bc] & [cc]
\end{vmatrix}.
\]

Comparing this with the value of x deduced above, we see that
[αα] is the value of x which we should deduce if we put

[al] = 1,     [bl] = [cl] = 0.

Similarly [ββ] is the value of y obtained when we put

[al] = [cl] = 0   and   [bl] = 1.

Again

\[
D\beta_1 =
\begin{vmatrix}
{[aa]} & a_1 & [ac]\\
{[ab]} & b_1 & [bc]\\
{[ac]} & c_1 & [cc]
\end{vmatrix}
= a_1 B + b_1 F + c_1 G,
\]

where B, F, and G are the minors of $a_1, b_1, c_1$, in the determinant.
B is the same as the B given above.

We thus have

\[ D\alpha_1 = a_1 A + b_1 B + c_1 C, \qquad D\beta_1 = a_1 B + b_1 F + c_1 G. \]

\[
\begin{aligned}
\therefore\quad D^2\alpha_1\beta_1 ={}& A\,\{a_1 a_1 B + a_1 b_1 F + a_1 c_1 G\}\\
&+ B\,\{a_1 b_1 B + b_1 b_1 F + b_1 c_1 G\}\\
&+ C\,\{a_1 c_1 B + b_1 c_1 F + c_1 c_1 G\},
\end{aligned}
\]

\[
\begin{aligned}
D^2[\alpha\beta] ={}& A\,\{[aa]B + [ab]F + [ac]G\}\\
&+ B\,\{[ab]B + [bb]F + [bc]G\}\\
&+ C\,\{[ac]B + [bc]F + [cc]G\}\\
={}& D \cdot B,
\end{aligned}
\]

the first and last brackets being zero, as can easily be shown by
writing them in determinant form. It follows that

\[
D[\alpha\beta] = B =
\begin{vmatrix}
{[aa]} & 1 & [ac]\\
{[ab]} & 0 & [bc]\\
{[ac]} & 0 & [cc]
\end{vmatrix}
=
\begin{vmatrix}
0 & [ab] & [ac]\\
1 & [bb] & [bc]\\
0 & [bc] & [cc]
\end{vmatrix}.
\]

[αβ] is therefore the value of y obtained when

[al] = 1,     [bl] = [cl] = 0,

or the value of x obtained when

[al] = [cl] = 0,     [bl] = 1.
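
These results amount to saying that the sums [αα], [αβ], etc., are the values of the unknowns obtained from the normal equations with a unit right-hand side. The sketch below, which assumes NumPy and uses randomly generated (hypothetical) coefficients, forms the multipliers α, β, γ explicitly and compares the two routes.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(9, 3))          # hypothetical coefficients a_r, b_r, c_r
N = A.T @ A                          # [aa], [ab], ... of the normal equations

# Rows hold the multipliers alpha_r, beta_r, gamma_r, so that
# x = sum(alpha_r * l_r), y = sum(beta_r * l_r), z = sum(gamma_r * l_r).
multipliers = np.linalg.solve(N, A.T)
Q = multipliers @ multipliers.T      # [alpha alpha], [alpha beta], ...

# Unknowns obtained with [al] = 1, [bl] = [cl] = 0, and with [bl] = 1 alone.
with_al_unit = np.linalg.solve(N, np.array([1.0, 0.0, 0.0]))
with_bl_unit = np.linalg.solve(N, np.array([0.0, 1.0, 0.0]))

print("[alpha alpha] =", Q[0, 0], " x with [al]=1:", with_al_unit[0])
print("[alpha beta]  =", Q[0, 1], " y with [al]=1:", with_al_unit[1])
print("[beta beta]   =", Q[1, 1], " y with [bl]=1:", with_bl_unit[1])
print("weights p_x, p_y, p_z:", 1.0 / np.diag(Q))
```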

48.     Evaluation  of  the  Weights  of  x,  y,  z,  etc. 

The weight of the unknown which is first evaluated in the
Gauss solution can be immediately deduced. If in the equations
of § 41, we put

[al] = [bl] = 0   and   [cl] = 1,

then [bl1] = 0, and [cl2] = 1.

Equation (vi) then yields [cc2][γγ] = 1, or

\[ p_z = \frac{1}{[\gamma\gamma]} = [cc2]. \]

The quantity [cc2] is one of the quantities evaluated in the
process of solving the normal equations. In the general case
where there are m unknowns, the weight of the last unknown in
the order of elimination is the coefficient of that unknown in the
last elimination equation. Thus in table A, page 95, the weight
of w is given by

\[ p_w = [dd3]. \]

The weight of any unknown might be evaluated by making that
particular unknown the last in the order of elimination. If the
normal equations involve a large number of unknowns, of which
it is only necessary to find the weights of a few, it is a
considerable advantage to eliminate last the unknowns whose weights are
required.

The  following  special  cases  should  be  particularly  noted. 
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(a)     When  there  are  only  two  unknowns,  x  and  y. 
With  the  usual  notation 

[aa]  .  [66]  -  [ah]  .  [a6] 


Py=[hh\]  = 


[aa] 


And  similarly,  ;,.  =  ^^"^  ^^\^'^^  ^"^^  • 

(6)     When  there  are  three  unknowns,  x,  y,  z. 
It  has  been  shown  in  §  47,  that 

^^  "  [aa]  ~  A  \bh\  \c6\  -  [6c]  [6c] ' 
Similarly 

D  B 

^^  "  [cc\  [aa]  -  [ac]  \ac]  '  ^'  ~  [aa]  [66]  -  [a6]  [a6] ' 

where  D  is  the  determinant  formed  by  the  coefficients  of  the 
normal  equations.     It  may  also  be  written 

D  =  [aa]  [66]  [cc]  +  2[a6]  [6c]  [ca]  -  [aa]  [hcf  -  [66]  [ca^  -  [cc]  [a6]l 

When  the  normal  equations  contain  only  integral  coefficients, 
it  is  generally  simpler  to  solve  them  by  the  methods  of  ordinary 
algebra,  and  to  calculate  the  weights  by  means  of  the  above 
equations. 
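The formulae of case (b) lend themselves to a direct computation. The sketch below is illustrative only: the function name is hypothetical, and the integral coefficients are assumed values, chosen so that D takes the value 19899 met in the first worked example of this section (the signs of [ab] and [bc] are an assumption, and do not affect the weights when [ca] = 0).

```python
from fractions import Fraction


def weights_three_unknowns(aa, bb, cc, ab, bc, ca):
    """Weights of x, y, z from the determinant formulae of case (b)."""
    d = (aa * bb * cc + 2 * ab * bc * ca
         - aa * bc ** 2 - bb * ca ** 2 - cc * ab ** 2)
    p_x = Fraction(d, bb * cc - bc ** 2)
    p_y = Fraction(d, cc * aa - ca ** 2)
    p_z = Fraction(d, aa * bb - ab ** 2)
    return d, p_x, p_y, p_z


# Assumed integral coefficients (illustrative only).
D, p_x, p_y, p_z = weights_three_unknowns(aa=27, bb=15, cc=54, ab=6, bc=1, ca=0)
print(D, float(p_x), float(p_y), float(p_z))
```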

(c) For any number of unknowns, it is easy to write down
the weight of the last but one of the unknowns.

Take the case of four unknowns eliminated in the order
x, y, z, w. Then the weight of w is [dd3]. If we re-modelled our
solution, and eliminated in the order x, y, w, z, the auxiliary
quantities evaluated in the solution would remain unaltered as
far as affix 2. The normal equations remaining after eliminating
x and y are

\[ [dd2]\,w + [cd2]\,z - [dl2] = 0, \]
\[ [cd2]\,w + [cc2]\,z - [cl2] = 0. \]

From the first of these,

\[ w = -\frac{[cd2]}{[dd2]}\,z + \frac{[dl2]}{[dd2]}. \]

Substituting this in the last equation, we find

\[ p_z = \text{coefficient of } z \text{ in the final equation} = [cc2] - \frac{[cd2]^2}{[dd2]} = \frac{[cc2]\,[dd3]}{[dd2]}, \]

since $[dd3] = [dd2] - [cd2]^2/[cc2]$.
Thus $p_z$ is easily determined. For [cc2] is the last coefficient
in the [cc] column, [dd3] the last coefficient in the [dd] column,
and [dd2] the last coefficient but one in the [dd] column.
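The expression for the weight of the last unknown but one can be checked numerically. The following sketch uses a hypothetical set of four normal equations (the array `N` and the helper `eliminate` are assumptions made for the illustration) and compares [cc2][dd3]/[dd2] with the last pivot obtained when the elimination is actually repeated in the order x, y, w, z.

```python
from fractions import Fraction


def eliminate(matrix, steps):
    """Carry out `steps` stages of Gauss elimination (no row interchanges)
    and return the reduced array of auxiliaries."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    for k in range(steps):
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(n):
                a[i][j] -= factor * a[k][j]
    return a


# Hypothetical symmetric normal-equation coefficients for x, y, z, w.
N = [[6, 2, 1, 1],
     [2, 5, 2, 1],
     [1, 2, 7, 3],
     [1, 1, 3, 8]]

after_two = eliminate(N, 2)          # affix 2: x and y eliminated
cc2, dd2 = after_two[2][2], after_two[3][3]
dd3 = eliminate(N, 3)[3][3]          # weight of w

p_z_formula = cc2 * dd3 / dd2

# Direct check: eliminate in the order x, y, w, z and take the last pivot.
order = [0, 1, 3, 2]
swapped = [[N[i][j] for j in order] for i in order]
p_z_direct = eliminate(swapped, 3)[3][3]
print(p_z_formula, p_z_direct)
```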

Thus in the example worked in table B, page 97, $p_w = 3.436$,
and

\[ p_z = \frac{21.803 \times 3.436}{3.958}. \]

EXAMPLES. 

1. Continuation of Ex. 1, page 82.

To find the weights and p.e.'s of x, y, z. Using the formulae of case (b)
above, we find

\[ D = 27 \times 15 \times 54 - 27 - 54 \times 6^2 = 19899, \]

\[ p_x = \frac{19899}{15 \times 54 - 1} = 24.6, \]

\[ p_y = \frac{19899}{27 \times 54} = 13.6, \]

\[ p_z = \frac{19899}{27 \times 15 - 36} = 54.0. \]

Substituting the derived values of the unknowns in the observational
equations, we obtain the following residuals,

+.24     +.09     -.11     -.06.

The P.E. of a single observational equation is

\[ 0.6745 \sqrt{(.24)^2 + (.09)^2 + (.11)^2 + (.06)^2}, \]

or 0.19.

\[ r_x = \frac{0.19}{\sqrt{24.6}} = 0.038, \qquad r_y = \frac{0.19}{\sqrt{13.6}} = 0.052, \qquad r_z = \frac{0.19}{\sqrt{54.0}} = 0.026. \]

2. Solve the normal equations of Ex. 2, page 84, evaluate the weights of
the unknowns x and y, and from the formula

\[ [pvv] = [pll] - \frac{[pal]^2}{[paa]} - \frac{[pbl1]^2}{[pbb1]} \]

find the p.e. of an observational equation of unit weight, and the p.e.'s of
x and y.

3.  Verify  the  result  of  the  last  example  by  evaluating  [pvv]  from  the 
residuals. 

4. Prove that in the case of four unknowns x, y, z, w, eliminated in this
order, the weight of y is

\[ p_y = \frac{[bb1]\,[cc2]\,[dd3]}{[cc1]\,[dd2]'}, \]

where all the auxiliaries have the usual meaning except [dd2]', which is the
value of [dd2] when the unknowns are eliminated in the order x, z, w, y.

The product [aa] [bb1] [cc2] [dd3] is the common denominator of x, y, z, w,
when the expressions for the unknowns are reduced to the same denominator.
The value of the product is therefore independent of the order of elimination.

\[ \therefore\quad [aa]\,[bb1]\,[cc2]\,[dd3] = [aa]\,[cc1]\,[dd2]'\,[bb3], \]

whence the result immediately follows.
Deduce from table B the weight of y.

5.  Show  for  the  case  of  three  unknowns,  that  the  weights  of  the 
unknowns  must  always  be  positive. 

6.  In  Ex.  3,  page  86,  find  the  weights  of  the  unknowns. 

7. In the same example show that

\[ [vv] = [ll] - 36z^2 - 18\,(x^2 + y^2). \]

Write down the equations of condition in the form

\[ v_r + l_r = x\cos\theta_r + y\sin\theta_r + z. \]

Multiply by $l_r$ and add, remembering the relation

\[ [vv] = -[vl]. \]

49.  An  Alternative  Method  of  finding  the  Weights  of 
the  Unknowns. 

It has been shown that the reciprocal of the weight of x, namely
$[\alpha\alpha]$, is the value of x derived by solving a new set of normal
equations in which

\[ [al] = 1, \qquad [bl] = [cl] = \dots = 0. \]

The weight of each unknown can in fact be derived by solving the
appropriate set of normal equations. In practice the weights can
be conveniently found by combining the solution of all these subsidiary
sets of normal equations with the ordinary solution for the
unknowns. For the case of three unknowns, three additional
columns are added to the table of solution, with 1, 0, 0; 0, 1, 0;
and 0, 0, 1; written in positions corresponding to [al], [bl] and
[cl]. If these columns be headed R, S and T respectively, the
values of z derived in these columns are

\[ [\alpha\gamma], \qquad [\beta\gamma], \qquad [\gamma\gamma]. \]

The derivation of $[\alpha\alpha]$ and $[\beta\beta]$ follows quite simply from these.
The method can be most clearly understood from the working of
an actual example.

The following set of normal equations was derived by H. N.
Russell in some work on stellar parallax:

\[ \begin{aligned}
8.000x + 4.106y + 0.698z &= -2588 \\
+\,3.989y + 0.051z &= -1441 \\
+\,4.613z &= \phantom{-}1349.
\end{aligned} \]
The work of solution is considerably simplified by changing the
unit of the absolute terms to 1000 times its original value. For,
as the equations stand, the check sum for the first equation is
-2575.196; while in the equation obtained by our suggested change
of unit, the check sum is 10.216, containing five significant figures
instead of seven.

In writing the absolute terms in the table of solution, we
change the unit, so that they become -2.588, -1.441, 1.349.

The  solution  is  given  on  page  110.  The  last  column  indicates 
a  comparison  with  the  order  of  solution  in  table  A,  page  95. 
The  last  column  but  one  shows  how  each  line  is  derived  from 
those which precede it. The first three lines in the check column
give [as] + 1, [bs] + 1, [cs] + 1.

The work of solution was carried out by means of a Brunsviga
calculator. Division of a whole line by any quantity, as for
example, the division of line 8 by [bb1], i.e. by 1.8826, was carried
out by multiplying the whole line by the reciprocal of [bb1].

The last line of the table gives the values of z, $[\alpha\gamma]$, $[\beta\gamma]$, $[\gamma\gamma]$.

\[ z = 0.3453, \qquad [\alpha\gamma] = -0.0357, \qquad [\beta\gamma] = 0.0362, \qquad [\gamma\gamma] = 0.2221. \]

From line 9,

\[ y = 0.1627z - 0.0616 = -0.0044, \]
\[ [\beta\beta] = 0.1627\,[\beta\gamma] + 0.5312 = 0.5371, \]
\[ [\alpha\beta] = 0.1627\,[\alpha\gamma] - 0.2726 = -0.2784. \]

From line 4,

\[ x + 0.513y + 0.087z = -0.323, \quad\text{and}\quad x = -0.3517, \]
\[ [\alpha\alpha] + 0.513\,[\alpha\beta] + 0.087\,[\alpha\gamma] = 0.125, \quad\text{and}\quad [\alpha\alpha] = 0.271. \]

Hence the complete solution is:

x = -0.3517     weight 3.68.
y = -0.0044     weight 1.86.
z =  0.3453     weight 4.50.

Changing back to the original units

x = -351.7     p_x = 3.68.
y =   -4.4     p_y = 1.86.
z =  345.3     p_z = 4.50.
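The scheme of the extra columns R, S, T can be mimicked in a few lines of Python. The sketch below is not the tabular arrangement of page 110: it swaps in a plain Gauss-Jordan reduction of the normal equations augmented by the absolute terms and by the columns 1, 0, 0; 0, 1, 0; 0, 0, 1, so that the unknowns and the quantities $[\alpha\alpha]$, $[\alpha\beta]$, etc., emerge together. The function name is hypothetical; the coefficients are those of the equations quoted above, with the absolute terms in the changed unit. Carried to full machine precision, the weights should agree with the tabulated ones to within the rounding of the hand computation.

```python
def solve_with_weight_columns(normal, absolute):
    """Gauss-Jordan reduction of [N | l | I]; returns the unknowns and the
    array of [alpha alpha], [alpha beta], ... (the inverse of N)."""
    m = len(normal)
    # Each row: coefficients, absolute term, then the extra columns R, S, T, ...
    rows = [[float(v) for v in normal[i]] + [float(absolute[i])]
            + [1.0 if i == j else 0.0 for j in range(m)]
            for i in range(m)]
    for k in range(m):
        pivot = rows[k][k]
        rows[k] = [v / pivot for v in rows[k]]
        for i in range(m):
            if i != k:
                factor = rows[i][k]
                rows[i] = [vi - factor * vk for vi, vk in zip(rows[i], rows[k])]
    unknowns = [rows[i][m] for i in range(m)]
    inverse = [rows[i][m + 1:] for i in range(m)]
    return unknowns, inverse


# Normal equations of the example, written out symmetrically.
N = [[8.000, 4.106, 0.698],
     [4.106, 3.989, 0.051],
     [0.698, 0.051, 4.613]]
l = [-2.588, -1.441, 1.349]     # absolute terms in the changed unit

(x, y, z), inv = solve_with_weight_columns(N, l)
weights = [1.0 / inv[i][i] for i in range(3)]
print(x, y, z, weights)
```

As the text remarks, the column T could be dropped, since the weight of z is already given by the last pivot.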

The  method  here  used  may  be  extended  to  any  number  of 
unknowns.  If  we  have  to  solve  a  series  of  normal  equations 
involving  a  large  number  of  unknowns,  where  the  weights  of  only 
a  few  of  the  unknowns  are  required,  it  is  advisable  to  make  these 
unknowns  the  last  in  the  order  of  elimination.  If  the  weights  of 
only  two  unknowns  are  required,  the  solution  may  be  carried  out 
as  in  table  A,  page  95,  without  the  addition  of  extra  columns. 
The coefficient of the last unknown in the final elimination
equation will give the weight of that unknown, and the weight
of the last unknown but one can be most simply derived by the
use of the expression derived on page 107. If it should be required
to find the weights of more than two unknowns, it is
generally  better  to  introduce  an  extra  column  into  the  table  of 
solution,  for  each  unknown  whose  weight  is  required.  The 
weights  are  then  evaluated  as  part  of  the  general  solution  of 
the  normal  equations.  It  should  be  noted  that  the  T  column  may 
be  omitted  from  the  table  of  solution  for  three  unknowns,  since 
the  weight  of  the  last  unknown  is  the  coefficient  of  that  unknown 
in  the  final  elimination  equation. 

50.  The  order  of  solution  shown  in  the  examples  worked  out 
in  detail  in  this  chapter  can  be  varied  to  some  extent  to  suit  the 
individual  taste  of  the  computer.  When  the  coefficients  in  the 
normal  equations  are  given  to  three  or  four  significant  figures  (or 
more), it will generally be found convenient to solve by the Gauss
method of substitution, or some variant of that method, particularly
if there are more than three unknowns. If there are only
three unknowns some computers prefer the determinant method,
particularly when the work is carried out by the use of an arithmometer.
When the coefficients in the normal equations are
small  integers  the  determinant  method  may  be  used ;  or  it  may 
happen  that  the  equations  may  be  more  conveniently  solved  by 
the  methods  of  elementary  algebra,  as  in  Examples  1  and  6  at  the 
end  of  Chapter  V. 

The following example illustrates a method of solving the
normal equations, and of finding the weights of the unknowns, by
simple algebra. A set of four normal equations is given, and the
values of the unknowns and their weights are required. The
constant terms in the normal equations are written in literal
form.

\[ \begin{aligned}
3x + 2y + 2z + 2w &= [al] \qquad (1), \\
2x + 3y + 2z + 2w &= [bl] \qquad (2), \\
2x + 2y + 3z + 2w &= [cl] \qquad (3), \\
2x + 2y + 2z + 3w &= [dl] \qquad (4).
\end{aligned} \]

Adding all four equations, we find

\[ 9\,(x + y + z + w) = [al] + [bl] + [cl] + [dl]. \]

Adding together equations (1) and (2),

\[ 4\,(x + y + z + w) + x + y = [al] + [bl], \]

or

\[ x + y = \tfrac{5}{9}\,\{[al] + [bl]\} - \tfrac{4}{9}\,\{[cl] + [dl]\} \qquad (5). \]

Subtracting (2) from (1),

\[ x - y = [al] - [bl] \qquad (6). \]

From (5) and (6)

\[ x = \frac{7[al] - 2[bl] - 2[cl] - 2[dl]}{9}, \]

\[ y = \frac{-2[al] + 7[bl] - 2[cl] - 2[dl]}{9}, \]

with similar forms for z and w.

The values of the unknowns can be deduced by putting into
these expressions the arithmetical values of [al], etc. Clearly the
weights of all the unknowns are the same, being equal to 9/7; for
the coefficient of [al] in the expression for x, which is $[\alpha\alpha]$, is 7/9.
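The algebra of this example can be confirmed with exact arithmetic. The sketch below inverts the array of coefficients of equations (1) to (4) by a Gauss-Jordan reduction in rational numbers; the helper `invert` is a hypothetical name introduced for the illustration. The diagonal elements of the inverse are the reciprocals of the weights, and the first row gives the coefficients of [al], [bl], [cl], [dl] in the expression for x.

```python
from fractions import Fraction


def invert(matrix):
    """Exact Gauss-Jordan inverse using rational arithmetic."""
    n = len(matrix)
    rows = [[Fraction(v) for v in matrix[i]]
            + [Fraction(int(i == j)) for j in range(n)]
            for i in range(n)]
    for k in range(n):
        pivot = rows[k][k]
        rows[k] = [v / pivot for v in rows[k]]
        for i in range(n):
            if i != k:
                f = rows[i][k]
                rows[i] = [p - f * q for p, q in zip(rows[i], rows[k])]
    return [row[n:] for row in rows]


# Coefficients of the normal equations (1) to (4).
N = [[3, 2, 2, 2],
     [2, 3, 2, 2],
     [2, 2, 3, 2],
     [2, 2, 2, 3]]

inv = invert(N)
weights = [1 / inv[i][i] for i in range(4)]
print(inv[0], weights)
```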


Example 1. The normal equations solved above (p. 110) were derived
from the following set of observational equations:

                                        Residuals
1.000x - 0.061y + 0.907z = -18            +20
1.000x - 0.051y + 0.900z = -41             -1
1.000x + 0.291y - 0.634z = -538           +34
1.000x + 0.299y - 0.668z = -589            -5
1.000x + 0.315y - 0.736z = -656           -49
1.000x + 0.999y + 0.817z = -139           -66
1.000x + 1.026y + 0.733z = -58            +44
1.000x + 1.288y - 0.621z = -549           +23

Verify the residuals given in the last column, and show that the p.e.
of an observational equation is ±31.4. Hence show

r_x = ±16.5,
r_y = ±23.0,
r_z = ±14.8.

Example 2. Find the weight of y in the table above, without using the
columns R, S, T. (Vide page 107.)

Would it be possible to find the weight of x in the same way? If not,
what fresh auxiliary products would have to be calculated, in order to yield
p_x without using column R?
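The verification asked for in Example 1 can be sketched in Python as follows. The coefficients and absolute terms are those of the eight observational equations, on two assumptions about readings of the printed table: the fifth absolute term is taken as -656 and the coefficient of z in the sixth equation as +0.817, these being the values which reproduce the normal equations quoted in § 49. The residuals are formed as observed minus computed, the sign which agrees with the tabulated column. The function names are hypothetical.

```python
import math

# Coefficients of x, y, z and absolute terms l (assumed readings, see above).
A = [(1.0, -0.061,  0.907), (1.0, -0.051,  0.900),
     (1.0,  0.291, -0.634), (1.0,  0.299, -0.668),
     (1.0,  0.315, -0.736), (1.0,  0.999,  0.817),
     (1.0,  1.026,  0.733), (1.0,  1.288, -0.621)]
l = [-18.0, -41.0, -538.0, -589.0, -656.0, -139.0, -58.0, -549.0]


def normal_equations(rows, absolute):
    """Form the sums [aa], [ab], ... and [al], [bl], ..."""
    m = len(rows[0])
    coeffs = [[sum(r[i] * r[j] for r in rows) for j in range(m)]
              for i in range(m)]
    rhs = [sum(r[i] * t for r, t in zip(rows, absolute)) for i in range(m)]
    return coeffs, rhs


def solve(coeffs, rhs):
    """Gauss-Jordan elimination without row interchanges."""
    m = len(rhs)
    aug = [list(coeffs[i]) + [rhs[i]] for i in range(m)]
    for k in range(m):
        pivot = aug[k][k]
        aug[k] = [v / pivot for v in aug[k]]
        for i in range(m):
            if i != k:
                f = aug[i][k]
                aug[i] = [p - f * q for p, q in zip(aug[i], aug[k])]
    return [row[m] for row in aug]


N, rhs = normal_equations(A, l)
x, y, z = solve(N, rhs)

# Residuals taken as observed minus computed, the sign used in the table.
residuals = [t - (r[0] * x + r[1] * y + r[2] * z) for r, t in zip(A, l)]
vv = sum(v * v for v in residuals)
pe = 0.6745 * math.sqrt(vv / (len(A) - 3))   # n - m = 8 - 3
print([round(v) for v in residuals], round(pe, 1))
```

Because the machine carries more figures than the hand computation, the unknowns and residuals may differ from the printed ones in the last figure.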

51.  Alternative  Proof  of  the  Rule  for  finding  Weights 
of  the  Unknowns. 

If r be the p.e. of an observational equation of unit weight,
$p_x$, $p_y$, $p_z$, etc., the weights of the unknowns, and $r_x$, $r_y$, $r_z$, etc.,
the probable errors of the unknowns, then

\[ p_x r_x^2 = r^2 = p_y r_y^2 = p_z r_z^2 = \text{etc.} \]

It has already been shown in § 45 that we may write

\[ x = \alpha_1 l_1 + \alpha_2 l_2 + \dots = [\alpha l], \]
\[ y = \beta_1 l_1 + \beta_2 l_2 + \dots = [\beta l], \]
\[ z = \gamma_1 l_1 + \gamma_2 l_2 + \dots = [\gamma l], \]

where the $\alpha$'s, $\beta$'s, and $\gamma$'s are constants whose values depend
entirely on the values of the coefficients $a_1$, $b_1$, $c_1$, $a_2$, $b_2$, $c_2$, etc.,
of the observational equations.

Since r is the P.E. of each observed quantity $l_1$, $l_2$, etc.,

\[ r_x^2 = (\alpha_1^2 + \alpha_2^2 + \dots)\,r^2 = [\alpha\alpha]\,r^2, \qquad p_x = \frac{1}{[\alpha\alpha]}. \]

And similarly

\[ p_y = \frac{1}{[\beta\beta]}, \qquad p_z = \frac{1}{[\gamma\gamma]}. \]

We now have to find the values of $[\alpha\alpha]$, $[\beta\beta]$, $[\gamma\gamma]$, etc.
In the subsequent work we shall limit our discussion to the
case of three unknowns, but the results will all hold for any
number of unknowns.

Substituting $x = [\alpha l]$, $y = [\beta l]$, $z = [\gamma l]$, in the normal equations,
we find

\[ [aa]\,[\alpha l] + [ab]\,[\beta l] + [ac]\,[\gamma l] - [al] = 0, \]
\[ [ab]\,[\alpha l] + [bb]\,[\beta l] + [bc]\,[\gamma l] - [bl] = 0, \]
\[ [ac]\,[\alpha l] + [bc]\,[\beta l] + [cc]\,[\gamma l] - [cl] = 0. \]

These relations must be identically true for all values of the l's.
It follows that the coefficient of each separate l in each of these
equations is zero. Collecting these coefficients and equating them
to zero, we obtain three sets of n equations each:

\[ [aa]\,\alpha_1 + [ab]\,\beta_1 + [ac]\,\gamma_1 - a_1 = 0, \qquad [aa]\,\alpha_2 + [ab]\,\beta_2 + [ac]\,\gamma_2 - a_2 = 0, \quad \text{etc.} \qquad (A_1), \]

\[ [ab]\,\alpha_1 + [bb]\,\beta_1 + [bc]\,\gamma_1 - b_1 = 0, \quad \text{etc.} \qquad (A_2), \]

\[ [ac]\,\alpha_1 + [bc]\,\beta_1 + [cc]\,\gamma_1 - c_1 = 0, \quad \text{etc.} \qquad (A_3). \]


Multiplying the equations in each set by $a_1$, $a_2$, etc., and
adding, we obtain the three equations:

\[ [aa]\,[a\alpha] + [ab]\,[a\beta] + [ac]\,[a\gamma] - [aa] = 0, \]
\[ [ab]\,[a\alpha] + [bb]\,[a\beta] + [bc]\,[a\gamma] - [ab] = 0, \]
\[ [ac]\,[a\alpha] + [bc]\,[a\beta] + [cc]\,[a\gamma] - [ac] = 0. \]

This is a set of three equations homogeneous in three unknowns,
$[a\alpha] - 1$, $[a\beta]$, and $[a\gamma]$, having no relation between the coefficients;
i.e. they are three independent equations. It follows that
each variable must vanish.

\[ [a\alpha] = 1, \qquad [a\beta] = 0, \qquad [a\gamma] = 0. \]

Similarly it might be shown that

\[ [b\alpha] = 0, \qquad [b\beta] = 1, \qquad [b\gamma] = 0, \]
\[ [c\alpha] = 0, \qquad [c\beta] = 0, \qquad [c\gamma] = 1. \]

These nine relations will be referred to as (B).

Now multiply equations $(A_1)$ by $\alpha_1$, $\alpha_2$, etc., and add:

\[ [aa]\,[\alpha\alpha] + [ab]\,[\alpha\beta] + [ac]\,[\alpha\gamma] = [a\alpha] = 1. \]

Similarly from $(A_2)$ and $(A_3)$,

\[ [ab]\,[\alpha\alpha] + [bb]\,[\alpha\beta] + [bc]\,[\alpha\gamma] = [b\alpha] = 0, \]
\[ [ac]\,[\alpha\alpha] + [bc]\,[\alpha\beta] + [cc]\,[\alpha\gamma] = [c\alpha] = 0. \]

These three equations will be referred to as $(C_1)$.

The equations $(C_1)$ involve three unknowns $[\alpha\alpha]$, $[\alpha\beta]$, $[\alpha\gamma]$. It
should be noted that they are the same as the original set of
normal equations, with $[\alpha\alpha]$, $[\alpha\beta]$, $[\alpha\gamma]$ substituted for x, y, z, and
with [al] = 1, [bl] = [cl] = 0. Hence we may derive immediately
the rule for finding $[\alpha\alpha]$ already given in § 45 above.

$[\alpha\alpha]$ is the value of x derived from the normal equations when

\[ [al] = 1, \qquad [bl] = 0, \qquad [cl] = 0. \]

If equations (A) were multiplied by $\beta_1$, $\beta_2$, etc., and then by
$\gamma_1$, $\gamma_2$, etc., we should derive two sets of equations $(C_2)$, $(C_3)$, similar
to $(C_1)$, from which we should derive the values of $[\beta\beta]$ and $[\gamma\gamma]$.
$[\beta\beta]$ is the value of y derived from the normal equations when

\[ [al] = 0, \qquad [bl] = 1, \qquad [cl] = 0; \]

and $[\gamma\gamma]$ is the value of z derived from the normal equations when

\[ [al] = 0, \qquad [bl] = 0, \qquad [cl] = 1. \]
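The relations of this section can be checked numerically. In the sketch below the five observational equations are hypothetical, and the multipliers $\alpha_i$, $\beta_i$, $\gamma_i$ are obtained by solving the normal equations with $a_i$, $b_i$, $c_i$ as absolute terms (equations (A) above). The code then forms the sums $[a\alpha]$, $[a\beta]$, etc., which should reproduce relations (B), and compares $[\alpha\alpha]$ with the value of x obtained when [al] = 1, [bl] = [cl] = 0. The helper `invert` is an assumed name; exact fractions are used throughout.

```python
from fractions import Fraction


def invert(matrix):
    """Exact Gauss-Jordan inverse using rational arithmetic."""
    n = len(matrix)
    rows = [[Fraction(v) for v in matrix[i]]
            + [Fraction(int(i == j)) for j in range(n)]
            for i in range(n)]
    for k in range(n):
        pivot = rows[k][k]
        rows[k] = [v / pivot for v in rows[k]]
        for i in range(n):
            if i != k:
                f = rows[i][k]
                rows[i] = [p - f * q for p, q in zip(rows[i], rows[k])]
    return [row[n:] for row in rows]


# Hypothetical coefficients (a_i, b_i, c_i) of five observational equations.
A = [(1, 0, 1), (1, 1, 0), (0, 1, 1), (1, 1, 1), (1, 2, 3)]
m = 3
N = [[sum(r[i] * r[j] for r in A) for j in range(m)] for i in range(m)]
Q = invert(N)   # column k is the solution for absolute terms 1,0,0 / 0,1,0 / 0,0,1

# Row 0, 1, 2 of `mult` holds alpha_i, beta_i, gamma_i for every observation i.
mult = [[sum(Q[k][j] * r[j] for j in range(m)) for r in A] for k in range(m)]

# Relations (B): B[i][k] is [a alpha], [a beta], ... and should be the unit array.
B = [[sum(r[i] * mult[k][n] for n, r in enumerate(A)) for k in range(m)]
     for i in range(m)]

alpha_alpha = sum(v * v for v in mult[0])
print(B, alpha_alpha, Q[0][0])
```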

52. Alternative Proof of the Formula

\[ r = 0.6745 \sqrt{\frac{[vv]}{n - m}}. \]

For the sake of simplicity in writing we shall consider the case
of three unknowns; or m = 3.

The observational equations, n in number, are

\[ a_1 x + b_1 y + c_1 z - l_1 = v_1 = a_1[\alpha l] + b_1[\beta l] + c_1[\gamma l] - l_1, \]
\[ a_2 x + b_2 y + c_2 z - l_2 = v_2 = a_2[\alpha l] + b_2[\beta l] + c_2[\gamma l] - l_2, \]

etc.

\[ v_1 = l_1\,(a_1\alpha_1 + b_1\beta_1 + c_1\gamma_1 - 1) + l_2\,(a_1\alpha_2 + b_1\beta_2 + c_1\gamma_2) + \text{etc.}, \]
\[ v_2 = l_1\,(a_2\alpha_1 + b_2\beta_1 + c_2\gamma_1) + l_2\,(a_2\alpha_2 + b_2\beta_2 + c_2\gamma_2 - 1) + \text{etc.}, \]

etc.

If the observations were perfect, the values of $l_1$, $l_2$, etc., would be
absolutely accurate, and the residuals $v_1$, $v_2$, etc., would all be zero.
Let $dl_1$, $dl_2$, etc., be the errors in $l_1$, $l_2$, etc. Then we may write

\[ v_1 = (a_1\alpha_1 + b_1\beta_1 + c_1\gamma_1 - 1)\,dl_1 + (a_1\alpha_2 + b_1\beta_2 + c_1\gamma_2)\,dl_2 + \text{etc.} \]

The quantities $l_1$, $l_2$, etc., are all determined independently, and so
the errors $dl_1$, $dl_2$, etc., are all independent, though their mean
square errors are all equal. Let this M.S.E. be $\mu$.
Then the mean value of $v_1^2$ is given by

\[ \overline{v_1^2} = \mu^2\,\{(a_1\alpha_1 + b_1\beta_1 + c_1\gamma_1 - 1)^2 + (a_1\alpha_2 + b_1\beta_2 + c_1\gamma_2)^2 + \text{etc.}\} \]
\[ = \mu^2\,\{a_1^2[\alpha\alpha] + b_1^2[\beta\beta] + c_1^2[\gamma\gamma] + 2a_1b_1[\alpha\beta] + 2a_1c_1[\alpha\gamma] + 2b_1c_1[\beta\gamma] - 2\,(a_1\alpha_1 + b_1\beta_1 + c_1\gamma_1) + 1\}. \]

We may repeat this process for each residual. Then we obtain
the equation

\[ [vv] = \mu^2\,\{[aa][\alpha\alpha] + [bb][\beta\beta] + [cc][\gamma\gamma] + 2[ab][\alpha\beta] + 2[ac][\alpha\gamma] + 2[bc][\beta\gamma] - 2\,[a\alpha + b\beta + c\gamma] + n\} \]
\[ = \mu^2\,\{[aa][\alpha\alpha] + [ab][\alpha\beta] + [ac][\alpha\gamma] \]
\[ \qquad +\,[ab][\alpha\beta] + [bb][\beta\beta] + [bc][\beta\gamma] \]
\[ \qquad +\,[ac][\alpha\gamma] + [bc][\beta\gamma] + [cc][\gamma\gamma] \]
\[ \qquad -\,2[a\alpha] - 2[b\beta] - 2[c\gamma] + n\}, \]

and referring back to equations (B) and (C) above, we find

\[ [vv] = \mu^2\,\{n + 3 - 6\} = \mu^2\,(n - 3). \]

Whence we obtain the usual equation

\[ r = 0.6745\,\mu = 0.6745 \sqrt{\frac{[vv]}{n - 3}}. \]

For the case of m unknowns, the equation derived above would be

\[ [vv] = \mu^2\,\{m - 2m + n\} = \mu^2\,(n - m), \]

and then

\[ r = 0.6745 \sqrt{\frac{[vv]}{n - m}} \quad\text{or}\quad 0.6745 \sqrt{\frac{[pvv]}{n - m}}. \]

The working is easily modified to apply to the case where the
observational equations have unequal weights.
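The counting argument of this section can be verified directly: the sum of the squares of all the coefficients of the errors $dl_j$ in the residuals $v_i$ should come to n - m. The sketch below does this with exact fractions for a hypothetical set of five observational equations in three unknowns, so that the expected total is 2. The data and the helper `invert` are assumptions made for the illustration.

```python
from fractions import Fraction


def invert(matrix):
    """Exact Gauss-Jordan inverse using rational arithmetic."""
    n = len(matrix)
    rows = [[Fraction(v) for v in matrix[i]]
            + [Fraction(int(i == j)) for j in range(n)]
            for i in range(n)]
    for k in range(n):
        pivot = rows[k][k]
        rows[k] = [v / pivot for v in rows[k]]
        for i in range(n):
            if i != k:
                f = rows[i][k]
                rows[i] = [p - f * q for p, q in zip(rows[i], rows[k])]
    return [row[n:] for row in rows]


# Hypothetical coefficients (a_i, b_i, c_i) of five observational equations.
A = [(1, 0, 1), (1, 1, 0), (0, 1, 1), (1, 1, 1), (1, 2, 3)]
n, m = len(A), 3
N = [[sum(r[i] * r[j] for r in A) for j in range(m)] for i in range(m)]
Q = invert(N)

# mult[0][j], mult[1][j], mult[2][j] are alpha_j, beta_j, gamma_j.
mult = [[sum(Q[k][j] * r[j] for j in range(m)) for r in A] for k in range(m)]

# Coefficient of dl_j in v_i is a_i*alpha_j + b_i*beta_j + c_i*gamma_j - (1 if i == j).
total = Fraction(0)
for i, row in enumerate(A):
    for j in range(n):
        coeff = sum(row[k] * mult[k][j] for k in range(m)) - (1 if i == j else 0)
        total += coeff * coeff

print(total, n - m)
```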

53.     Probable   Error  of  a  Function  of  the  Unknowns. 

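Let the given function be f(X, Y, Z). As in previous work,
the function f may be reduced to a linear form,

\[ f(X, Y, Z) = f(X_0, Y_0, Z_0) + \frac{\partial f}{\partial X}\,x + \frac{\partial f}{\partial Y}\,y + \frac{\partial f}{\partial Z}\,z, \]

where $X = X_0 + x$, $Y = Y_0 + y$, $Z = Z_0 + z$,
$X_0$, $Y_0$, and $Z_0$ being approximate values of the unknowns. Since
the errors of x, y, z are not independent, we cannot apply to the
last equation the reasoning of § 22. With the usual notation,

\[ x = [\alpha l], \qquad y = [\beta l], \qquad z = [\gamma l], \]

and

\[ f = f(X_0, Y_0, Z_0) + \frac{\partial f}{\partial X}\,[\alpha l] + \frac{\partial f}{\partial Y}\,[\beta l] + \frac{\partial f}{\partial Z}\,[\gamma l]. \]

Now the P.E.'s of all the l's are the same, being the p.e. r of an
observational equation of unit weight. Hence if $r_F$ be the P.E. of
the given function f, we may write

\[ r_F^2 = r^2 \left\{ \left(\frac{\partial f}{\partial X}\right)^2 [\alpha\alpha] + \left(\frac{\partial f}{\partial Y}\right)^2 [\beta\beta] + \left(\frac{\partial f}{\partial Z}\right)^2 [\gamma\gamma] + 2\,\frac{\partial f}{\partial X}\frac{\partial f}{\partial Y}\,[\alpha\beta] + 2\,\frac{\partial f}{\partial X}\frac{\partial f}{\partial Z}\,[\alpha\gamma] + 2\,\frac{\partial f}{\partial Y}\frac{\partial f}{\partial Z}\,[\beta\gamma] \right\}. \]

It should be noted that in general it is not correct to write

\[ r_F^2 = \left(\frac{\partial f}{\partial X}\right)^2 r_x^2 + \left(\frac{\partial f}{\partial Y}\right)^2 r_y^2 + \left(\frac{\partial f}{\partial Z}\right)^2 r_z^2. \]

The R.H.S. of the last equation is equivalent to the first three terms
on the R.H.S. of the previous equation. In general the other
terms do not vanish.

A complete solution of a set of normal equations includes the
determination of the weights of the unknowns; i.e. it requires the
evaluation of $[\alpha\alpha]$, $[\beta\beta]$, $[\gamma\gamma]$. If the method of the table on
page 110 above is followed, the values of $[\alpha\beta]$, $[\beta\gamma]$, $[\alpha\gamma]$ are also
derived in the course of the complete solution. It is thus practicable
to evaluate $r_F$ from the equation derived here.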
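The importance of the product terms can be seen from a small numerical sketch. It uses the values of $[\alpha\alpha]$, $[\alpha\beta]$, etc., found in the worked example of § 49, and the p.e. 31.4 of an observational equation from Example 1 of § 50. The function f = X + Y is a hypothetical choice made purely for illustration, as are the function names.

```python
import math


def pe_of_function(partials, q, r):
    """P.E. of a linearised function, product terms included."""
    m = len(partials)
    quad = sum(partials[i] * partials[j] * q[i][j]
               for i in range(m) for j in range(m))
    return r * math.sqrt(quad)


def pe_ignoring_products(partials, q, r):
    """The incorrect form, keeping only the squared terms."""
    return r * math.sqrt(sum(p * p * q[i][i] for i, p in enumerate(partials)))


# [alpha alpha], [alpha beta], ... as found in the worked example of section 49.
Q = [[0.271, -0.2784, -0.0357],
     [-0.2784, 0.5371, 0.0362],
     [-0.0357, 0.0362, 0.2221]]
r = 31.4                       # p.e. of an observational equation
partials = (1.0, 1.0, 0.0)     # hypothetical function f = X + Y

full = pe_of_function(partials, Q, r)
naive = pe_ignoring_products(partials, Q, r)
print(round(full, 2), round(naive, 2))
```

For this choice of f the large negative value of $[\alpha\beta]$ makes the correct p.e. much smaller than the value obtained by neglecting the product terms.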

54. Normal Place Method in Formation of Observational
Equations.

It sometimes happens that a series of observations are such
that referred to one of the variable circumstances, say the time,
they cluster in groups round certain values of that variable, with
well-marked gaps between successive groups. In such a case there
is generally no appreciable loss of accuracy in the resulting solution
if the mean of each group is taken, and associated with the mean
value of the variable (time). Such a mean value is called a
normal place. The use of the normal place instead of the individual
observations reduces the number of observational equations,
and  produces  a  considerable  saving  in  time  and  labour  in  the 
formation  of  the  normal  equations.  For  example,  if  observations 
of  a  planet  are  made  at  intervals  of  a  fortnight,  it  is  perfectly 
legitimate  to  take  the  mean  of  one  night's  observations  to  form  a 
normal  place  for  working  out  the  elements  of  the  orbit  of  the 
planet;  provided  that  the  orbit  of  the  planet  is  not  affected  by 
any  disturbing  cause  of  short  period. 
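The formation of normal places may be sketched as follows. The grouping rule (a new group begins wherever two successive times differ by more than a chosen gap), the function name, and the data are all assumptions made for the illustration. The number of observations in each group is returned as well, since it might reasonably be used as the weight of the normal place, though that use is not discussed in the text.

```python
def normal_places(times, values, max_gap):
    """Group observations separated by no more than `max_gap` in time and
    return one (mean time, mean value, count) triple per group."""
    pairs = sorted(zip(times, values))
    if not pairs:
        return []
    groups = [[pairs[0]]]
    for previous, current in zip(pairs, pairs[1:]):
        if current[0] - previous[0] > max_gap:
            groups.append([])        # a well-marked gap starts a new group
        groups[-1].append(current)
    return [(sum(t for t, _ in g) / len(g),
             sum(v for _, v in g) / len(g),
             len(g)) for g in groups]


# Hypothetical observations: three nights about a fortnight apart.
times = [0.0, 0.1, 0.2, 14.0, 14.1, 28.0]
values = [1.0, 2.0, 3.0, 10.0, 12.0, 20.0]
places = normal_places(times, values, max_gap=1.0)
print(places)
```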

55.  Testing  the  Results  of  the  Least  Square  Solution 
for  Unusual  Errors,  and  for  Systematic  or  Constant  Errors. 

When  the  normal  equations  have  been  solved,  yielding  the 
values  of  the  unknowns,  the  next  step  in  the  work  is  to  form  the 
residuals  by  substituting  the  derived  values  of  the  unknowns  in 
the  observational  equations.  The  sum  of  the  squares  of  the 
residuals  will  be  required  in  evaluating  the  probable  errors,  and 
even  if  it  should  be  convenient  to  evaluate  this  sum  by  any  other 
method, the value of [vv] derived directly from the observational
equations affords a useful check upon the work of solution. Again
it  may  happen  that  some  of  the  observations  are  affected  by 
sources  of  error  not  present  in  the  other  observations.  The  effect 
of  such  unusual  errors  would  be  to  yield  unusually  high  values  to 
the  corresponding  residuals.  If  some  of  the  residuals  should  be 
large  in  comparison  with  the  probable  errors,  it  may  be  advisable 
to  reject  the  observations  which  yield  the  high  residuals.  The 
problem of the rejection of observations will be considered more
fully in Chapter VIII. Meanwhile we note in passing that the
evaluation  of  the  residuals  is  an  essential  stage  in  the  process  of 
testing  the  value  of  the  different  observations.  Further  it  should 
be  noted  that  if  some  of  the  observations  are  rejected,  it  is  necessary 
to  repeat  the  work  of  solution,  as  the  coefficients  in  the  normal 
equations  are  all  slightly  modified.  The  values  of  the  unknowns 
obtained  in  the  first  solution  may  be  taken  as  approximate  values, 
for which the new normal equations will yield corrections.

All the methods of solution hitherto considered are based on
the  assumption  that  the  observations  are  affected  only  by  accidental 
errors,  all  constant  and  systematic  errors  having  been  removed. 
The  distinction  between  constant,  systematic,  and  accidental  errors 
has  already  been  discussed  in  Chapter  I  above.  It  was  suggested 
that  constant  and  systematic  errors  should  be  eliminated  either  by 
changes  in  the  method  of  solution,  or  by  empirical  corrections 
deduced  from  special  observations  designed  to  determine  the  exact 
law  of  such  errors.  In  practice,  however,  it  is  seldom  possible  to 
eliminate  all  the  constant  and  systematic  errors,  simply  because 
we  can  never  know  the  nature  of  all  the  errors  to  which  an 
observation  is  subject.  Errors  of  theory  give  rise  to  incorrect 
coefficients  in  the  observational  equations,  and  these  in  turn  enter 
into  the  coefficients  of  the  normal  equations,  and  so  affect  the 
values of the unknowns. And when the values of the unknowns
are substituted in the observational equations, the errors of theory affect
the values of the residuals, and appear in effect as systematic errors.

It  is  even  more  important  to  consider  the  systematic  errors 
than  the  accidental  errors,  since  the  latter  are  eliminated  by  mere 
repetition,  or  by  the  mere  increase  in  the  number  of  observations. 
The  methods  of  observation  should  be  so  arranged  as  to  avoid 
systematic  errors,  as  far  as  possible,  or  to  provide  corrections  for 
such  systematic  errors  as  are  not  eliminated  by  the  methods  of 
observation. Finally, the residuals may be made to yield information
concerning the presence of systematic errors. A number
of methods suggest themselves.

1.  A  comparison  of  the  actual  error  curve  with  the  theoretical 
error  curve.  Thus  in  §  24,  Examples  4  and  5,  the  curves  of  actual 
error  show  a  well-marked  deviation  from  the  form  of  the  normal 
error  curve,  so  suggesting  the  presence  of  a  systematic  error.  In 
Example  4  it  was  possible  to  suggest  a  plausible  explanation  of 
this  error. 

The  clustering  of  points  in  the  diagram  representing  the 
observations  would  also  point  to  the  presence  of  a  systematic  error. 

2.  If  the  residuals  show  a  tendency  to  have  a  certain  sign 
when  a  certain  set  of  conditions  exist,  and  the  opposite  sign  when 
these  conditions  are  absent,  or  when  other  conditions  exist,  the 
result may be ascribed to systematic error.

3. When the observations extend over a long period of time,
the  residuals  should  be  arranged  in  order  of  date  of  observation. 
If  the  residuals  so  arranged  follow  a  systematic  law  of  variation,  a 
systematic  error  may  be  expected.  For  example,  an  observer 
knowing  nothing  of  the  aberration  of  light,  would  find  differences 
between  the  observed  places  of  stars  in  different  months.  But  if 
the  observations  of  one  star  extended  over  a  number  of  years,  and 
the  errors  were  arranged  in  order  of  date  of  observation,  it  would 
be  found  that  the  error  for  any  one  star  was  the  same  at  the  same 
time  of  the  year.  The  systematic  error  could  then  be  made  an 
object  of  prediction,  and  would  cease  to  be  an  error. 

Again  if  both  the  residuals  and  certain  of  the  conditions  of  the 
observations,  say  the  temperature,  be  arranged  in  order  of  date  of 
observation, any correspondence between the variations of the
residuals  and  the  conditions  considered  would  indicate  a  systematic 
error  due  to  those  conditions. 

4.  A  comparison  of  the  results  of  solution  with  those  of  an 
independent  set  of  observations  made  under  different  conditions, 
or  by  different  methods,  may  help  to  determine  the  presence  of 
constant  or  systematic  errors. 
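Methods 2 and 3 can be given a simple numerical form. The sketch below counts the changes of sign in a series of residuals arranged in order of date, and evaluates the ordinary correlation coefficient between the residuals and some condition of observation such as the temperature. Both statistics are suggestions only, not tests prescribed in the text: with purely accidental errors one would expect changes of sign at roughly half of the steps and a correlation near zero. The data and function names are hypothetical.

```python
import math


def sign_changes(residuals):
    """Number of changes of sign along the series (zeros are skipped)."""
    signs = [v > 0 for v in residuals if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def correlation(xs, ys):
    """Ordinary product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical residuals in order of date, with the temperature at each date.
residuals = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
temperature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

changes = sign_changes(residuals)
corr = correlation(temperature, residuals)
print(changes, round(corr, 3))
```

Here a single change of sign in six residuals, together with a correlation close to -1, would point strongly to a systematic error depending on the temperature.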


MISCELLANEOUS  EXAMPLES. 

1. The Hartmann-Cornu formula for the reduction of prismatic spectrograms*.

The usual method of reducing spectrograms, i.e. of measuring the wave-lengths
of lines in the spectrum by the use of lines of known wave-length,
is that due to Hartmann. If n be the measured scale-reading of a line of
wave-length $\lambda$, n and $\lambda$ are connected by the formula

\[ n - n_0 = \frac{c}{(\lambda - \lambda_0)^a} \qquad (1), \]

where $n_0$, $\lambda_0$, and c are constants for each plate, and a is constant for all
plates taken with the same instrument. The value of a being known for the
particular instrument used, $n_0$, $\lambda_0$, and c can be approximately determined
by measuring the positions of three lines of known wave-length. The values
of these constants may then be improved by measuring a number of lines,
say 12, and giving the constants such values as will afford the best fit to the
12 lines.

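* Vide Monthly Notices, R.A.S., Vol. lxxi, p. 663, Stratton.

Corresponding to a line of wave-length $\lambda_r$ the scale-reading should be

\[ n_r = n_0 + \frac{c}{(\lambda_r - \lambda_0)^a} \qquad (2). \]

If this equation is not satisfied, let

\[ dn_r = n_r - \left( n_0 + \frac{c}{(\lambda_r - \lambda_0)^a} \right) \qquad (3). \]

We have to find corrections $\delta n_0$, $\delta\lambda_0$, $\delta c$ to the constants $n_0$, $\lambda_0$, and c, so that equation
(2) shall be satisfied.

The equations to be satisfied for the 12 standard lines are

\[ n_r = (n_0 + \delta n_0) + \frac{c + \delta c}{\{\lambda_r - (\lambda_0 + \delta\lambda_0)\}^a} \qquad (r = 1, 2, \dots, 12) \]

\[ \phantom{n_r} = n_0 + \delta n_0 + \frac{c}{(\lambda_r - \lambda_0)^a} + \frac{\delta c}{(\lambda_r - \lambda_0)^a} + \frac{ac\,\delta\lambda_0}{(\lambda_r - \lambda_0)^{a+1}}, \]

to the first order of the small corrections. Making use of equation (3) we
may write these equations as

\[ \delta n_0 + \frac{\delta c}{(\lambda_r - \lambda_0)^a} + \frac{ac}{(\lambda_r - \lambda_0)^{a+1}}\,\delta\lambda_0 = dn_r. \]

These form a set of 12 observational equations which may be solved by
the methods of least squares. The normal equations are

\[ 12\,\delta n_0 + \sum_1^{12} \frac{1}{(\lambda_r - \lambda_0)^a}\,\delta c + \sum_1^{12} \frac{ac}{(\lambda_r - \lambda_0)^{a+1}}\,\delta\lambda_0 = \sum_1^{12} dn_r, \]

\[ \sum_1^{12} \frac{1}{(\lambda_r - \lambda_0)^a}\,\delta n_0 + \sum_1^{12} \frac{1}{(\lambda_r - \lambda_0)^{2a}}\,\delta c + \sum_1^{12} \frac{ac}{(\lambda_r - \lambda_0)^{2a+1}}\,\delta\lambda_0 = \sum_1^{12} \frac{dn_r}{(\lambda_r - \lambda_0)^a}, \]

\[ \sum_1^{12} \frac{ac}{(\lambda_r - \lambda_0)^{a+1}}\,\delta n_0 + \sum_1^{12} \frac{ac}{(\lambda_r - \lambda_0)^{2a+1}}\,\delta c + \sum_1^{12} \frac{a^2c^2}{(\lambda_r - \lambda_0)^{2a+2}}\,\delta\lambda_0 = \sum_1^{12} \frac{ac\,dn_r}{(\lambda_r - \lambda_0)^{a+1}}. \]

In these equations, all the quantities which occur are known, except
$\delta n_0$, $\delta c$, $\delta\lambda_0$, for which these equations have to be solved.

In practice, it is found that when the normal equations are solved, the
weights of the corrections which they yield are very small, i.e. the corrections
are very badly determined. It is customary to assume $\lambda_0$ to remain constant,
so that $\delta\lambda_0 = 0$. The normal equations then yield corrections $\delta n_0$ and $\delta c$ which
are well-determined.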
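The customary procedure, with $\lambda_0$ held constant, may be sketched as follows. Every number in the sketch is hypothetical: the exponent a, the "true" constants used to manufacture twelve scale-readings, and the approximate constants to be corrected are all invented for the illustration. With $\delta\lambda_0 = 0$ the observational equations are linear in $\delta n_0$ and $\delta c$, so a single solution of the two normal equations recovers the corrections.

```python
def hartmann(n0, c, lam0, a, lam):
    """Scale-reading given by the Hartmann formula n = n0 + c / (lam - lam0)**a."""
    return n0 + c / (lam - lam0) ** a


# Hypothetical instrument constant, fixed lambda_0, and twelve standard lines.
a, lam0 = 1.0, 2000.0
lines = [3900.0 + 100.0 * k for k in range(12)]

# Synthetic "measured" readings manufactured from assumed true constants.
measured = [hartmann(10.0, 5000.0, lam0, a, lam) for lam in lines]

# Approximate constants, as if found from three lines only.
n0, c = 9.5, 5100.0

# Observational equations: delta_n0 + u_r * delta_c = dn_r.
u = [(lam - lam0) ** (-a) for lam in lines]
dn = [obs - hartmann(n0, c, lam0, a, lam) for obs, lam in zip(measured, lines)]

# Two normal equations, solved by determinants.
s1, su, suu = float(len(u)), sum(u), sum(v * v for v in u)
sd, sud = sum(dn), sum(v * d for v, d in zip(u, dn))
det = s1 * suu - su * su
delta_n0 = (sd * suu - su * sud) / det
delta_c = (s1 * sud - su * sd) / det
print(n0 + delta_n0, c + delta_c)
```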

2. Position of the sun's axis*.

In an investigation on the position of the sun's axis, Dyson derived his
observational equations in the form

\[ x\cos\theta_r + y\sin\theta_r + z = l_r, \]

where $\theta$ had one of thirteen values, 0°, ±15°, ±30°, ..., etc., ±90°. Each
observed quantity $l_r$ was subject to the same probable error e.

* Dyson, Monthly Notices, R.A.S., Vol. lxxii, p. 564.

The normal equations, derived in the usual way, are

\[ x\sum\cos^2\theta + y\sum\sin\theta\cos\theta + z\sum\cos\theta = \sum l\cos\theta, \qquad \text{p.e. } e\sqrt{\textstyle\sum\cos^2\theta}, \]
\[ x\sum\sin\theta\cos\theta + y\sum\sin^2\theta + z\sum\sin\theta = \sum l\sin\theta, \qquad \text{p.e. } e\sqrt{\textstyle\sum\sin^2\theta}, \]
\[ x\sum\cos\theta + y\sum\sin\theta + 13z = \sum l, \qquad \text{p.e. } e\sqrt{13}, \]

or, evaluating the trigonometric coefficients,

\[ 6x + 7.64z = \alpha, \qquad \text{P.E. } e\sqrt{6}, \]
\[ 7y = \beta, \qquad \text{P.E. } e\sqrt{7}, \]
\[ 7.64x + 13z = \gamma, \qquad \text{P.E. } e\sqrt{13}. \]

Thus y is obtained with a p.e. $e/\sqrt{7}$, but it is found from the actual solution
that x is determined with a p.e. 2.1e, so that x is not by any means as clearly
determined as y.

The chief value of the investigation leading to the above set of normal
equations therefore consists in the determination of the quantity y, which is
given by the least squares solution with a p.e. $e/\sqrt{7}$ or .378e. It is of
considerable importance to consider how closely this value can be obtained
without going through the least squares solution.

Writing down the 13 observational equations in order, and subtracting
the 13th from the 1st, the 12th from the 2nd, the 11th from the 3rd, etc., we
obtain the equations

\[ y = \tfrac{1}{2}\,(l_1 - l_{13}), \]
\[ .966y = \tfrac{1}{2}\,(l_2 - l_{12}), \]
\[ .866y = \tfrac{1}{2}\,(l_3 - l_{11}), \]

etc.

Each of these equations has a p.e. $e/\sqrt{2}$.

Adding the first two of these equations, we obtain 1.966y with a p.e.
$\dfrac{e}{\sqrt{2}}\sqrt{2}$.

Adding the first three, we obtain 2.832y with a p.e. $\dfrac{e}{\sqrt{2}}\sqrt{3}$; and so on.

In this way y is determined

with p.e. .707e using one equation at each end of the series of thirteen,
          .509e   "   two equations      "          "
          .432e   "   three    "         "          "
          .409e   "   four     "         "          "
          .392e   "   five     "         "          "
          .396e   "   six      "         "          "
          .378e   "   least squares solution.

By using five equations at each end of the series we thus obtain a very
close approximation to the accuracy of a least squares solution, while four
equations at each end only increase the p.e. from .38e to .41e. The use of
four equations at each end of the series yields results of a high degree of
accuracy, and affords a considerable saving of labour.

The  above  set  of  equations  illustrates  a  fact  which  has  wide  applications. 
If  in  the  observational  equations  two  of  the  unknowns  appear  with  coefficients 
which  are  more  or  less  related,  e.g.  always  have  the  same  sign  (as  in  the  case 
of  the  coefficients  of  x  and  z  in  the  above  equations),  their  weights  in  the 
solution  will  be  small.  If  the  coefficient  of  z  were  always  twice  that  of  x 
we  could  not  determine  either  separately. 

3. Solve the normal equations derived above for x, y, z, and find the
weights of x and z. Verify the statement that the p.e. of x is 2.1e.

4. Why is it not legitimate to proceed as follows in Example 2?

$$13\,(6x + 7.64z) - 7.64\,(7.64x + 13z)$$

has P.E. made up of $13e\sqrt{6}$ and $7.64e\sqrt{13}$.

Therefore p.e. of $(78 - 7.64^2)\,x$ is $e\sqrt{13^2\cdot 6 + 7.64^2\cdot 13}$,
whence the p.e. of x can be deduced.


CHAPTER   VII 

THE     ADJUSTMENT     OF     CONDITIONED     OBSERVATIONS 

56.  When  the  quantities  measured,  or  the  unknowns  which 
they  involve,  are  not  independent,  but  are  connected  a  priori  by 
certain  relations  which  must  be  satisfied  by  the  adjusted  values, 
the  methods  of  Chapters  V  and  VI  are  not  directly  applicable. 

Let there be n directly observed quantities $M_1, M_2, \ldots, M_n$,
of weights $p_1, p_2, \ldots, p_n$, and let the most probable values of the
observed quantities be $L_1, L_2, \ldots, L_n$.

Then, if $v_1$, $v_2$, etc., be the residuals, we have the relations

$$L_1 - M_1 = v_1, \qquad L_2 - M_2 = v_2, \qquad \text{etc.} \qquad \text{(i)}.$$

In Chapters V and VI it was supposed that the quantities
$L_1$, $L_2$, etc., could be accurately represented as functions of certain
unknown quantities X, Y, Z, etc., m in number. The same method
might be applied to the problem we now have to consider. In
general the a priori relations mentioned above are expressible as
explicit functions of $L_1$, $L_2$, etc., and consequently as explicit
functions of $v_1$, $v_2$, etc.; and it is more convenient to regard

$$v_1, v_2, \ldots, v_n$$

as n unknown quantities, which are connected by a certain number
of conditions, or functional relations. Let these relations, m' in
number, be reduced to linear form (cf. page 76), and written

$$h_1v_1 + h_2v_2 + h_3v_3 + \ldots - l_1 = 0,$$
$$k_1v_1 + k_2v_2 + k_3v_3 + \ldots - l_2 = 0, \quad \text{etc.} \qquad \text{(ii)},$$

where the coefficients $h_1$, $h_2$, $k_1$, $k_2$, $l_1$, $l_2$, etc., are all known
quantities. The least square theory requires that [pvv] shall be a minimum
subject to the conditions represented by equations (ii). There are
two methods of effecting this.

57.     Direct  Solution  by  Substitution. 

By the use of equations (ii) it is possible to express m' of the
unknown residuals as linear functions of the remaining n − m'
residuals. Substituting the values thus obtained in [pvv] we
obtain an expression involving only n − m' independent unknowns.
Differentiating this expression with respect to each of these
independent unknowns in turn, we obtain n − m' linear equations
whose solution yields the value of the n − m' independent
unknowns. The remaining m' residuals can be evaluated by the
use of the expressions first deduced from the equations (ii). The
residuals are then all known.

Example  1.     The  Adjustment  of  Coplanar  Angles. 

The  following  are  the  observed  values  of  four  coplanar  angles.     What  are 
the  most  probable  values  of  the  angles  ? 

AOB =  80° 13' 10"   weight 3,
BOC =  83° 18'  8"   weight 4,
COD =  72°  6'  4"   weight 2,
DOA = 124° 21' 16"   weight 2.

Sum = 359° 58' 38".

Let x, y, z, w be the corrections, measured in seconds, to be added to the
measured values of these angles.

Then 359° 58' 38" + (x + y + z + w)" = 360°,

or $$x + y + z + w = 82 \qquad (1).$$

This conditional equation must be rigorously satisfied by the adjusted
values of the angles.

The observational equations are

x = 0   weight 3,
y = 0   weight 4,
z = 0   weight 2,
w = 0   weight 2      (2).

Substituting for w from equation (1), we replace the last of equations (2) by

$$-x - y - z + 82 = 0, \quad \text{weight 2} \qquad (3).$$

The normal equations derived from the first three equations in (2),
and equation (3), are

$$5x + 2y + 2z = 164,$$
$$2x + 6y + 2z = 164,$$
$$2x + 2y + 4z = 164.$$

Subtracting the second equation from the first, and the third from the
second, we find

$$3x = 4y = 2z.$$

Whence we find

$$x = \frac{328}{19}, \qquad y = \frac{246}{19}, \qquad z = \frac{492}{19}, \qquad w = \frac{492}{19},$$

or x = 17", y = 13", z = 26", w = 26".

The adjusted values of the angles are therefore
AOB =  80° 13' 27",
BOC =  83° 18' 21",
COD =  72°  6' 30",
DOA = 124° 21' 42".
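
The arithmetic of Example 1 can be checked with a short sketch in Python. It uses only the figures quoted in the example; the helper names (`to_seconds`, `to_dms`) and the choice of exact fractions are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction

def to_seconds(degrees, minutes, seconds):
    """Convert an angle in degrees, minutes, seconds to seconds of arc."""
    return (degrees * 60 + minutes) * 60 + seconds

def to_dms(total_seconds):
    """Convert whole seconds of arc back to (degrees, minutes, seconds)."""
    minutes, seconds = divmod(total_seconds, 60)
    degrees, minutes = divmod(minutes, 60)
    return degrees, minutes, seconds

observed = {"AOB": (80, 13, 10), "BOC": (83, 18, 8),
            "COD": (72, 6, 4), "DOA": (124, 21, 16)}

# Closing error in seconds: the right-hand side of conditional equation (1).
closure = 360 * 3600 - sum(to_seconds(*dms) for dms in observed.values())

# The normal equations give 3x = 4y = 2z = t.  Putting x = t/3, y = t/4,
# z = t/2 into the first normal equation 5x + 2y + 2z = 2 * closure fixes t.
t = Fraction(2 * closure) / (Fraction(5, 3) + Fraction(2, 4) + Fraction(2, 2))
x, y, z = t / 3, t / 4, t / 2
w = closure - x - y - z          # from the conditional equation (1)
corrections = dict(zip(observed, (x, y, z, w)))

# All three normal equations must be satisfied exactly.
assert 5*x + 2*y + 2*z == 2*x + 6*y + 2*z == 2*x + 2*y + 4*z == 2 * closure

# Apply the corrections, rounded to whole seconds as in the text.
adjusted = {name: to_dms(to_seconds(*dms) + round(corrections[name]))
            for name, dms in observed.items()}
```

Working in exact fractions keeps the values 328/19, 246/19, 492/19 visible before they are rounded to whole seconds.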

Example 2. The three angles of a coplanar triangle are all equally well
observed. To find the most probable values of the angles.

Let A, B, C be the three angles, and $\epsilon$ the spherical excess of the triangle.

Then $$A + B + C = 180^\circ + \epsilon.$$

Let $M_1$, $M_2$, $M_3$ be the observed values of the angles, and let

$$A = M_1 + v_1, \qquad B = M_2 + v_2, \qquad C = M_3 + v_3.$$

Then the problem is to find the most probable values of the residuals
$v_1$, $v_2$, $v_3$, where

$$v_1 + v_2 + v_3 = 180^\circ + \epsilon - M_1 - M_2 - M_3 = a, \text{ say.}$$

The least square theory requires that

$$v_1^2 + v_2^2 + v_3^2$$

shall be a minimum subject to the condition

$$v_1 + v_2 + v_3 = a.$$

Substituting for $v_3$, we find that

$$v_1^2 + v_2^2 + (a - v_1 - v_2)^2$$

must be a minimum.

Differentiating with respect to $v_1$ and $v_2$ in turn, we obtain the equations

$$2v_1 + v_2 = a,$$
$$v_1 + 2v_2 = a.$$

Whence we deduce

$$v_1 = v_2 = \frac{a}{3} = v_3.$$

58. Method of Undetermined Multipliers or Correlates.

The method of direct substitution is only practicable when
the relations expressed by equations (ii) are few in number and
simple in form. In other cases the minimum value of [pvv] can
best be found by the method of "Undetermined Multipliers."

Equations (ii) are multiplied by $-2A_1$, $-2A_2$, etc., and the
products are added to [pvv], yielding the expression

$$[pvv] - 2A_1(h_1v_1 + h_2v_2 + \ldots - l_1) - 2A_2(k_1v_1 + k_2v_2 + \ldots - l_2) - \text{etc.} \qquad \text{(iii)}.$$

We then proceed to find the values of v which will make this
expression a minimum. Differentiating with respect to $v_1$, $v_2$, etc.
in turn, we obtain a system of n equations:

$$p_1v_1 = A_1h_1 + A_2k_1 + \ldots,$$
$$p_2v_2 = A_1h_2 + A_2k_2 + \ldots, \quad \text{etc.} \qquad \text{(iv)}.$$

Substituting these values of $v_1$, $v_2$, etc. in the equations (ii), we
obtain a system of m' equations:

$$A_1\left[\frac{hh}{p}\right] + A_2\left[\frac{hk}{p}\right] + \ldots = l_1,$$
$$A_1\left[\frac{hk}{p}\right] + A_2\left[\frac{kk}{p}\right] + \ldots = l_2, \quad \text{etc.} \qquad \text{(v)},$$

where

$$\left[\frac{hh}{p}\right] = \frac{h_1^2}{p_1} + \frac{h_2^2}{p_2} + \ldots \text{ etc.}, \qquad \left[\frac{hk}{p}\right] = \frac{h_1k_1}{p_1} + \frac{h_2k_2}{p_2} + \ldots \text{ etc.}$$

These m' equations involve the m' correlates $A_1$, $A_2$, etc. as
unknowns, and the solution yields the values of the correlates.
Substituting the values so obtained in equations (iv), we obtain
the most probable values of the residuals $v_1$, $v_2$, etc. Equations (v)
are called the normal equations for the correlates $A_1$, $A_2$, etc.
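
The procedure of this section, forming the normal equations (v), solving them for the correlates, and substituting in (iv), can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, each condition is passed as a list of its coefficients, and the second data set (five equally weighted residuals with two conditions) is invented purely for illustration.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Bring a row with a non-zero pivot into position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def adjust_by_correlates(weights, conditions, constants):
    """Return (correlates, residuals) for linear conditions sum(c_i v_i) = l.

    conditions[j][i] is the coefficient of v_i in the j-th condition and
    constants[j] is the corresponding l_j, as in equations (ii).
    """
    p = [Fraction(w) for w in weights]
    n = len(p)
    # Normal equations (v): coefficient of A_k in equation j is [c_j c_k / p].
    normal = [[sum(Fraction(cj[i] * ck[i]) / p[i] for i in range(n))
               for ck in conditions] for cj in conditions]
    correlates = solve_linear(normal, constants)
    # Equations (iv): p_i v_i = A_1 h_i + A_2 k_i + ...
    residuals = [sum(a * c[i] for a, c in zip(correlates, conditions)) / p[i]
                 for i in range(n)]
    return correlates, residuals

# Example 1 of section 57: one condition, x + y + z + w = 82.
corr1, res1 = adjust_by_correlates([3, 4, 2, 2], [[1, 1, 1, 1]], [82])

# Hypothetical data: five equally weighted residuals, two conditions.
corr2, res2 = adjust_by_correlates(
    [1, 1, 1, 1, 1], [[1, 1, 1, 0, 0], [0, 0, 1, 1, 1]], [6, 3])
```

With a single condition the normal equations reduce to one equation in one correlate, which is why the four-angle example can be solved almost by inspection.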

Example. Example 1 above might be worked out by this method. We
should have to make

$$3x^2 + 4y^2 + 2z^2 + 2w^2 - 2A\,(x + y + z + w - 82)$$

a minimum.

Differentiating with respect to x, y, z, w, in turn, we find

$$3x = A = 4y = 2z = 2w.$$

Since $$x + y + z + w = 82,$$

we immediately deduce the same values of x, y, z, w as were given above.

59. The Precision of the Unknowns.

It was shown on page 100 that in a system of n observations
involving m independent unknowns, the p.e. of an observational
equation of unit weight is

$$r = 0.6745\sqrt{\frac{[pvv]}{n - m}}.$$

In the case considered in this chapter there are n unknowns,
connected by m' relations. But we can use the m' relations (ii) to
eliminate m' of the unknown quantities, so that we shall have
n observed values, involving n − m' independent unknowns. We
may then use the formula given above for evaluating r.

But $m = n - m'$, or $n - m = m'$,

$$\therefore\quad r = 0.6745\sqrt{\frac{[pvv]}{m'}}.$$
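
As a rough numerical illustration of this formula, the sketch below evaluates r for the four coplanar angles of Example 1 in section 57, where n = 4 and m' = 1. The variable names are hypothetical, and the residuals are recomputed from the weights and the closing error of 82" rather than copied.

```python
import math
from fractions import Fraction

weights = [3, 4, 2, 2]       # weights of the four observed angles
closing_error = 82           # seconds of arc, from conditional equation (1)
m_conditions = 1             # m', the number of conditions

# With one condition each residual is A / p, where the correlate A follows
# from the requirement that the residuals sum to the closing error.
correlate = Fraction(closing_error) / sum(Fraction(1, p) for p in weights)
residuals = [correlate / p for p in weights]

# [pvv] is the weighted sum of squares of the residuals.
pvv = sum(p * v * v for p, v in zip(weights, residuals))

# Probable error of an observation of unit weight, in seconds of arc.
r = 0.6745 * math.sqrt(float(pvv) / m_conditions)
```

With only one condition the divisor m' is 1, so the estimate rests on a single degree of freedom and should be read as an order of magnitude only.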

The P.E. of a residual $v_s$ of weight $p_s$ is

$$r_s = 0.6745\sqrt{\frac{[pvv]}{m'\,p_s}}.$$


For a full development of the subject of this chapter, and its
application to survey work in particular, the reader is referred to

Wright and Hayford, Adjustment of Observations, Chapters v, vi, and vii.

F. R. Helmert, Ausgleichungsrechnung, Chapter iv.

Oscar S. Adams, Application of Theory of Least Squares to the Adjustment
of Triangulation. U.S. Coast and Geodetic Survey, Special Publication
No. 28.

Jordan, Handbuch der Vermessungskunde, Bd. i.


CHAPTER   VIII 

THE    REJECTION    OF    OBSERVATIONS 

60.  An  observer  making  a  series  of  observations  of  any  kind 
has  power  to  reject  any  observation  if  he  is  certain  that  it  is 
vitiated  by  some  unusual  sources  of  error  which  do  not  affect  the 
other  observations  in  the  series.  To  put  this  in  other  words,  the 
observer  is  to  a  certain  extent  free  to  choose  the  time  for  making 
his  observations  so  that  the  external  conditions  shall  vary  as  little 
as possible during the series. He is supreme in his own department,
having the power to retain or reject observations according
to  his  judgment  of  the  extent  to  which  the  external  causes  of  error 
react  upon  his  measurements. 

But  when  the  observational  material  is  put  into  the  hands  of 
the  computer  there  arises  a  new  question.  Shall  the  computer  be 
allowed to reject any observation whose residual is much larger
than  those  of  the  remaining  observations  ?  He  should  clearly  be 
allowed  to  reject  an  observation  when  he  is  convinced  that  it  is 
affected  by  an  error  from  some  unusual  source,  which  does  not 
affect  the  other  observations  in  the  series ;  or  if  the  observation  is 
clearly  spoiled  by  some  definite  blunder,  such  as  the  mis-reading 
of  a  scale  by  five  divisions.  It  is  sometimes  possible  to  correct 
blunders  of  this  type,  and  to  retain  the  observation,  but  the 
greatest  caution  is  necessary  in  making  such  corrections.  The 
real  difficulty  lies  in  deciding  whether  a  large  residual  is  due  to 
some  unusual  source  of  error,  or  is  due  to  the  chance  occurrence 
of  a  large  number  of  small  accidental  errors  with  the  same  sign. 
If  the  latter  alternative  be  true,  the  large  residual  is  in  accordance 
with  the  law  of  error,  and  its  rejection  will  decrease  rather  than 
increase  the  accuracy  of  the  final  result. 

A number of criteria, based on more or less rigid analysis, have
been  put  forward  by  various  writers.  The  best  known  of  these  is 
Peirce's  Criterion.  The  underlying  principle  is  that  the  doubtful 
observations  should  be  rejected  when  the  probability  of  the  system 
of  errors  obtained  by  retaining  them  is  less  than  the  product  of 
the  probability  of  the  system  of  errors  obtained  by  rejecting  them, 
multiplied  by  the  probability  of  making  so  many,  and  no  more, 
abnormal  observations.  As  this  criterion  is  not  in  general  use, 
and  is  rather  tedious  in  its  application,  we  shall  not  enter  into  the 
full  proof.  The  reader  who  desires  to  know  more  of  it  is  referred 
to  Chauvenet's  Theoretical  and  Practical  Astronomy,  Vol.  ii, 
Appendix,  §  58,  where  Peirce's  proof  is  reproduced  almost  word 
for  word ;  and  to  the  proof  and  tables  by  Gould  in  Astronomical 
Journal, Vol. iv, and U.S. Coast and Geodetic Survey Report, 1854,
pp.  131,  132*. 

Chauvenet (Vol. ii) gives a criterion "for the rejection of one
doubtful observation." The probability that an error is less than
t is, as we have already seen,

$$\Theta(t) = \frac{2}{\sqrt{\pi}}\int_0^t e^{-t^2}\,dt.$$

Of a series of m observations, the number whose errors may be
expected to be less than t will be $m\,\Theta(t)$, and the number which
will exceed t will be $m\,(1 - \Theta(t))$. If this last quantity be less
than ½, then it follows that an error greater than t has a greater
probability against it than for it, and may therefore be rejected.
The limiting error t which may be rejected is therefore given by

$$m\,(1 - \Theta(t)) = \tfrac{1}{2} \quad \text{or} \quad \Theta(t) = \frac{2m - 1}{2m}.$$

The function $\Theta(t)$ has been tabulated on page 19.

The application of the criterion by means of the formula given
above is extremely simple. Suppose we have a series of 100
observations. Then

$$\frac{2m - 1}{2m} = 0.995.$$

Referring to page 19 we find this value for $\Theta(t)$ when $t/r = 4.2$. The
limiting value of t is 4.2 times the probable error. So that if one
of a series of 100 observations has a residual greater than 4.2 times
the probable error of the series, Chauvenet's criterion would reject
that observation.

* See also, "Note on Peirce's Criterion," S. A. Saunder, Monthly Notices, R.A.S.,
Vol. lxiii, p. 432.
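
The limiting ratio can also be found without the table. The sketch below is an illustration only: it assumes that the tabulated function is the ordinary error function (available in Python as `math.erf`) and that the probable error corresponds to the argument 0.476936, for which the function equals one half. The names `RHO` and `chauvenet_limit` are hypothetical.

```python
import math

RHO = 0.476936  # argument at which the error function equals 1/2

def chauvenet_limit(m):
    """Limiting error for m observations, in units of the probable error."""
    target = (2 * m - 1) / (2 * m)
    low, high = 0.0, 10.0
    # Bisection: erf is increasing, so narrow the bracket around the root.
    for _ in range(100):
        middle = (low + high) / 2
        if math.erf(middle) < target:
            low = middle
        else:
            high = middle
    return low / RHO

limit_100 = chauvenet_limit(100)   # close to the 4.2 quoted in the text
limit_10 = chauvenet_limit(10)     # a shorter series gives a smaller limit
```

The limit grows slowly with the number of observations, so a long series tolerates larger residuals before the criterion rejects one.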

Chauvenet's  criterion  only  considers  the  rejection  of  one 
observation,  but  when  one  has  been  rejected,  the  rule  may  be 
applied  to  consider  the  rejection  of  another,  and  so  on.  The 
criterion  is  simple  and  easy  to  apply,  but  it  is  probably  too 
sweeping.  It  does  not  allow  sufficiently  for  the  possible  presence 
of  a  large  number  of  small  accidental  errors  of  the  same  sign. 

Other  criteria  have  been  suggested,  but  few  of  them  are  really 
useful  in  practice.  The  whole  question  of  the  possibility  of 
rejecting  observations  on  the  ground  of  theoretical  discussion 
based  on  residuals  only,  has  given  rise  to  a  considerable  amount  of 
controversy.  Bessel  opposed  the  rejection  of  any  observation 
unless  the  observer  was  satisfied  that  the  external  conditions 
produced  some  unusual  source  of  error  not  present  in  the  other 
observations  of  the  series.  Peirce's  criterion  was  at  an  early  date 
subjected  to  very  severe  criticism.  Airy*  claimed  that  it  was 
defective  in  foundation,  and  illusive  in  its  results.  He  maintained 
that,  so  long  as  the  observer  was  satisfied  that  the  same  sources  of 
error  were  at  work,  though  in  varying  degrees,  throughout  a  series 
of  observations,  the  computer  should  have  no  right  to  reject  any 
observation  by  a  discussion  based  solely  on  antecedent  probability. 
An observation should be rejected only when a thorough examination
showed that the causes of error normally at work were not
sufficient  to  produce  the  error  in  the  doubtful  observation.  Airy 
also  cited  a  case  where  the  rejection  of  the  observations  having 
large  residuals  led  to  poor  results.  In  the  preceding  century,  at 
a  time  when  the  figure  of  the  earth  was  not  well  determined, 
azimuth  observations  were  made  at  Beachy  Head  and  Dunnose,  in 
connection  with  the  survey  of  England,  and  the  results  obtained 
were  poor.  Later  on,  when  the  full  record  of  the  observations  fell 
into  the  hands  of  General  Colby,  it  was  found  that  only  the 
observations which were in closest agreement had been used in
the  reduction.  When  the  calculations  were  repeated,  using  all  the 
observations,  results  of  a  high  degree  of  accuracy  were  obtained. 

*  Astronomical  Journal,  Vol.  iv,  p.  137. 

Though many of the arguments of Airy and others against the
use  of  mathematical  criteria  such  as  Peirce's  have  been  shown  to 
be  based  on  faulty  premises,  the  fact  remains  that  none  of  these 
criteria  have  ever  come  into  general  use. 

If  the  distribution  of  errors  be  in  strict  accordance  with  the 
law of error, only a few large residuals will occur. Referring to the
table on page 19, we see that the probability of an error greater
than 5r, where r is the probable error of a single observation, is
0.001, and the probability of an error greater than 3.5r is 0.018.
Thus only one observation in 1000 should have an error as great
as 5r, while one observation in 55 should have an error as great as 3.5r.
These  numbers  form  the  theoretical  basis  of  the  following  rule, 
which is advocated for general use by Wright and Hayford*.

"Reject each observation for which the residual exceeds five
times the probable error of a single observation. Examine carefully
each observation for which the residual exceeds 3.5 times the
probable error, and reject it if any of the accompanying conditions
are such as to produce lack of confidence."
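
The two probabilities on which this rule rests can be recomputed directly. The sketch below assumes the normal law of error, under which the chance of an error exceeding k times the probable error is the complementary error function of 0.476936 k; the function name is hypothetical.

```python
import math

RHO = 0.476936  # argument at which the error function equals 1/2

def probability_exceeding(k):
    """Chance that an error exceeds k times the probable error r."""
    return math.erfc(k * RHO)

p_5r = probability_exceeding(5.0)     # under one in a thousand
p_35r = probability_exceeding(3.5)    # about 0.018, roughly one in 55
```

The value for 5r comes out slightly below the rounded 0.001 of the table, so the rule is, if anything, a little more cautious than the round figure suggests.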

This  criterion  has  the  merits  of  simplicity,  ease  of  application, 
and  a  fairly  sound  theoretical  basis.  For  a  moderate  number  of 
observations  it  cannot  be  said  to  be  too  sweeping. 

*  Adjustment  of  Observations. 


CHAPTER  IX 

ALTERNATIVES    TO    THE    NORMAL    LAW    OF    ERRORS 

61.  We  have  already  dealt  briefly  with  one  or  two  cases  in 
which  the  frequency  distribution  showed  a  well-marked  deviation 
from  the  normal  form.  In  economic  and  biological  statistics,  in 
particular, the frequency distributions are liable to show considerable
dis-symmetry. It is therefore necessary to consider the
methods  of  representing  such  frequency  distributions  by  some 
substitute  for  the  normal  error  curve  of  Gauss.  The  purpose 
which  has  to  be  kept  in  view  is  that  of  replacing  the  series  which 
represents  the  observations  by  a  simple  formula  involving  a  few 
constants  only. 

Before  proceeding  to  the  development  of  possible  formulae 
which  shall  represent  various  types  of  frequency  distribution,  we 
must  define  briefly  a  number  of  statistical  terms.  Let  the  data 
be arranged in the form of a frequency distribution, and let $f_x$ be
the frequency of the characteristic x. The characteristic is the
scale in terms of which the observations are made. In the
diagrams of the earlier chapters of this book, it is the variable
represented along the horizontal axis (the x-axis).

The  median  is  the  value  of  the  characteristic  which  has  as 
many  observations  on  one  side  of  it  as  on  the  other. 

The  mode  is  the  value  of  the  characteristic  corresponding  to 
the  maximum  ordinate  of  the  frequency  curve.  The  position  of 
this  ordinate  cannot  be  accurately  determined  until  the  form  of 
the  frequency  curve  is  known,  since  it  is  not  necessarily  the 
ordinate corresponding to the biggest number of actual observations.
(Compare figure 7, page 46.)

The mean is the average value of the characteristic, and is
the arithmetic mean of all the observed values. If $f_x$ be the
frequency of a value x of the characteristic, or the frequency of
a group centred round x, the mean is

$$\frac{\Sigma f_x\,x}{\Sigma f_x}, \quad \text{or, for the curve,} \quad \frac{\int f_x\,x\,dx}{\int f_x\,dx}.$$

The ordinate through the mean is often spoken of as the centroid
vertical, since it passes through the centre of gravity of the
distribution.

The standard deviation, or S.D., represented by the symbol $\sigma$,
measures the closeness with which the measurements are clustered
about the mean. Measuring x from the mean, $\sigma$ is given by

$$\sigma^2 = \frac{\Sigma f_x\,x^2}{\Sigma f_x} \quad \text{or} \quad \sigma^2 = \frac{\int f_x\,x^2\,dx}{\int f_x\,dx}.$$

The first or second form is to be used according as the calculation
is made from the observations or from the curve. Using
the notation of Chapter III, we should have

$$\sigma^2 = \frac{[vv]}{n}, \quad \text{where } n = \Sigma f_x.$$

The mean square error is given by

$$\mu^2 = \frac{[vv]}{n - 1}.$$

Thus $\sigma = \mu\sqrt{\dfrac{n - 1}{n}}$, and when n is large, $\sigma$ and $\mu$ may be regarded
as identical. For the present we shall call this quantity the S.D.
($\sigma$), rather than the M.S.E. ($\mu$), since it is customary to do so in
all works on general statistics. It should be noted that $\sigma$ does not
depend upon the actual frequencies, but only on their distribution.
It measures the scatter of the observations about the mean.
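
These definitions can be illustrated by a short sketch. The frequency table used here is hypothetical, chosen only so that the arithmetic is easy to follow, and the function name is an assumption of the sketch.

```python
import math

def mean_sd_mse(values, frequencies):
    """Return (mean, S.D., mean square error) of a frequency distribution."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / n
    # [vv]: sum of squared deviations from the mean, weighted by frequency.
    vv = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies))
    return mean, math.sqrt(vv / n), math.sqrt(vv / (n - 1))

values = [1, 2, 3, 4, 5]          # hypothetical characteristic
frequencies = [1, 4, 6, 4, 1]     # hypothetical frequencies, n = 16

mean, sigma, mse = mean_sd_mse(values, frequencies)

# Doubling every frequency leaves the S.D. unchanged: it depends on the
# distribution of the frequencies, not on their actual size.
_, sigma_doubled, _ = mean_sd_mse(values, [2 * f for f in frequencies])
```

For this table the mean is 3 and the S.D. is 1, while the mean square error is slightly larger because of the divisor n − 1.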

When a curve is symmetrical, the mean, mode, and median
coincide. An unsymmetrical curve is often described as a skew
curve. The skewness of a curve or of a frequency distribution is
measured by the distance between the mean and the mode. It is
convenient to define the skewness by the equation

$$\text{Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}},$$

so that a curve with positive skewness has the mode to the left of
the mean in the usual diagram; in other words, the skewness is
positive when the rise to the maximum is more rapid than the fall
from the maximum. In a skew curve the mode is the value of the
characteristic which has the greatest frequency.

The nth moment of a frequency distribution about any ordinate
is obtained by multiplying each frequency by the nth power of its
distance from the ordinate in question, and adding together all
the products. With our present notation, the nth moment about
the ordinate x = a is

$$\Sigma f_x\,(x - a)^n.$$

It is customary to employ the symbol $\mu_n$ to represent the nth
moment about the mean, and the accented symbol $\mu'_n$ to denote
the nth moment about any other ordinate. It is generally more
convenient to evaluate moments about some other ordinate than the
mean, e.g. an ordinate corresponding to an integral value of the
characteristic, and then to deduce from these moments the
corresponding moments about the mean. Let the distance between
the assumed ordinate and the mean be a, and let the distance
from any other ordinate to these two ordinates be $x'_r$ and $x_r$
respectively. Then $x_r = x'_r - a$.

$$\mu_n = \Sigma f_r\,(x'_r - a)^n$$
$$= \Sigma f_r\,{x'_r}^{n} - na\,\Sigma f_r\,{x'_r}^{n-1} + \text{etc.}$$
$$= \mu'_n - na\,\mu'_{n-1} + \frac{n(n-1)}{1\cdot 2}\,a^2\mu'_{n-2} - \text{etc.} \qquad (1).$$

Or, we may write

$$\mu'_n = \Sigma f_r\,(x'_r)^n = \Sigma f_r\,(x_r + a)^n$$
$$= \mu_n + na\,\mu_{n-1} + \frac{n(n-1)}{1\cdot 2}\,a^2\mu_{n-2} + \ldots.$$

Whence it follows that

$$\mu_n = \mu'_n - na\,\mu_{n-1} - \frac{n(n-1)}{1\cdot 2}\,a^2\mu_{n-2} - \text{etc.} \qquad (2).$$

Either of equations (1) and (2) may be used to evaluate the
moments in turn, the two equations of necessity leading to
identical results.
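
A short sketch of the transfer expressed by equation (1) is given below. The frequency table is hypothetical and the function names are assumptions of the sketch. Moments are taken as plain sums, as in the definition above, so the zeroth moment is the total frequency and the distance a of the mean from the assumed ordinate is the first moment divided by the zeroth.

```python
from fractions import Fraction
from math import comb

def moment_about(values, frequencies, origin, order):
    """Sum of frequency times the given power of the distance from origin."""
    return sum(f * (Fraction(x) - origin) ** order
               for x, f in zip(values, frequencies))

def moments_about_mean(raw, a):
    """Apply equation (1): raw[k] is the k-th moment about the assumed
    ordinate and a is the distance from that ordinate to the mean."""
    return [sum(comb(n, k) * (-a) ** k * raw[n - k] for k in range(n + 1))
            for n in range(len(raw))]

values = [0, 1, 2, 3, 4]          # hypothetical characteristic
frequencies = [2, 5, 8, 4, 1]     # hypothetical frequencies

# Moments about the convenient ordinate x = 0, up to the fourth.
raw = [moment_about(values, frequencies, 0, k) for k in range(5)]
a = raw[1] / raw[0]               # position of the mean
central = moments_about_mean(raw, a)

# Direct evaluation about the mean, for comparison.
direct = [moment_about(values, frequencies, a, k) for k in range(5)]
```

Evaluating about x = 0 keeps every product an integer, which is the saving of labour the text has in mind.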
62. Most of the methods of fitting curves other than the
normal error curve, to series of observations, are based on the use
of moments. If the functional form contains n constants, these
are deduced by making the moments deduced from the observations
agree with the moments deduced from the curve, up to
the nth moment. This process yields n equations which suffice to
determine the n constants. The following simple example may
help to illustrate the utility of the method.

It is required to fit a curve of the form

$$y = a + bx + cx^2$$

to the frequency distribution

x =  1    2    3    4
y = 10   12   18   36

If only the first three terms corresponding to x = 1, 2, and 3,
were given, then by substituting the values of x and y in the
assumed equation we could solve the resulting three equations,
and so obtain a, b, and c. The results so obtained are

$$a = 12, \qquad b = -4, \qquad c = 2.$$

Thus the curve which accurately fits the first three terms is

$$y = 12 - 4x + 2x^2.$$

But when we substitute x = 4 in this equation we find y = 28.
Thus the curve does not fit the fourth term x = 4, y = 36. If any
three  out  of  the  four  points  given  above  were  selected,  a  curve 
could  be  made  to  pass  through  these  three  points,  but  it  would 
not  pass  through  the  fourth  point.  It  is  therefore  necessary  to 
adopt  some  method  of  calculating  the  constants  a,  b,  c,  so  that  all 
four  points  will  be  taken  into  account.  The  result  will  be  to 
yield  a  curve  which  will  pass  very  near  to  all  four  points,  but 
which  need  not  of  necessity  pass  through  any  one  of  them.  This 
could  easily  be  done  by  the  method  of  least  squares.  Another 
simple  method  is  to  adopt  values  of  the  constants  a,  b,  c,  such  that 
the  first  three  moments  of  the  actual  distribution  about  the  origin 
shall  be  equal  to  the  first  three  moments  yielded  by  the  curve. 
Taking the zero moment, and the first and second moments, we
obtain three equations:

$$(a + b + c) + (a + 2b + 2^2c) + (a + 3b + 3^2c) + (a + 4b + 4^2c) = 10 + 12 + 18 + 36,$$
$$1\,(a + b + c) + 2\,(a + 2b + 2^2c) + 3\,(a + 3b + 3^2c) + 4\,(a + 4b + 4^2c) = 10\cdot 1 + 12\cdot 2 + 18\cdot 3 + 36\cdot 4,$$
$$1^2(a + b + c) + 2^2(a + 2b + 2^2c) + 3^2(a + 3b + 3^2c) + 4^2(a + 4b + 4^2c) = 10\cdot 1^2 + 12\cdot 2^2 + 18\cdot 3^2 + 36\cdot 4^2,$$

or,

$$4a + 10b + 30c = 76,$$
$$10a + 30b + 100c = 232,$$
$$30a + 100b + 354c = 796.$$

The solution of these equations yields

$$a = 18, \qquad b = -11.6, \qquad c = 4.$$

The resulting curve is

$$y = 18 - 11.6x + 4x^2.$$

x     Calculated y     Observed y
1         10.4             10
2         10.8             12
3         19.2             18
4         35.6             36
The  table  shows  that  the  adopted  curve  does  not  fit  any  of  the 
observations  accurately,  but  yields  values  of  y  which  are  sometimes 
greater,  sometimes  less,  than  the  observed  values. 
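
The fit by moments can be reproduced with the following sketch, which builds the three moment equations from the four given points and solves them exactly. The use of Cramer's rule and the helper names are choices made for this illustration only.

```python
from fractions import Fraction

xs = [1, 2, 3, 4]
ys = [10, 12, 18, 36]

def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def with_column(m, col, column_values):
    """Copy of m with one column replaced (used for Cramer's rule)."""
    return [[column_values[r] if k == col else m[r][k] for k in range(3)]
            for r in range(3)]

# Row k equates the k-th moment of the curve a + bx + cx^2 about the
# origin to the k-th moment of the observed frequencies, for k = 0, 1, 2.
matrix = [[sum(x ** (k + j) for x in xs) for j in range(3)] for k in range(3)]
rhs = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]

determinant = det3(matrix)
a, b, c = (Fraction(det3(with_column(matrix, col, rhs)), determinant)
           for col in range(3))

fitted = [a + b * x + c * x * x for x in xs]
```

Because the zero moment is matched, the calculated ordinates sum to the same total, 76, as the observed ones.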

This simple example may help to illustrate the general nature
of statistical problems, where it is necessary to replace the
irregularities of observations by a smooth curve.

63. Pearson's Curves.

The  general  theory  of  curve-fitting  has  been  worked  out  in 
great  detail  by  Prof.  Karl  Pearson*.  Starting  from  certain 
properties  which  may  be  considered  essential  in  good  observations, 
Pearson  has  derived  a  series  of  formulae  for  possible  curves  of 
presumptive  errors. 

(1)  The  expression  must  replace  the  rough  material  of 
observation  by  a  smooth  continuous  curve ;  i.e.  it  must  graduate 
the  observations. 

(2)  The  expression  must  not  involve  too  many  constants,  and 
those  present  must  be  calculable  from  the  material  of  observation. 

(3)  There  must  be  a  systematic  method  of  approaching 
frequency  distributions. 

(4)  If  the  material  is  homogeneous,  the  ordinate  of  the  curve 
will  start  from  zero,  increase  to  a  maximum,  and  fall  again, 
possibly  at  a  different  rate,  to  zero. 

(5)  The  frequency  curve  will  generally  have  contact  with 
the  axis  of  x  at  the  ends  of  the  range. 

Of these, (1), (2) and (3) call for no remark, while (4) and (5)
are properties which are generally associated with the frequency
distributions obtained in actual practice. The conditions are in
general satisfied by the curve whose equation is

$$\frac{1}{y}\frac{dy}{dx} = \frac{x + a}{f(x)}.$$

For $\dfrac{dy}{dx} = 0$ when y = 0, so that the curve has contact at the
axis, and $\dfrac{dy}{dx} = 0$ when x = −a, corresponding to a maximum of
the curve.

Expanding by Maclaurin's Theorem, we may write

$$f(x) = b_0 + b_1x + b_2x^2 + \ldots \text{ etc.}$$

Pearson  has  considered  in  detail  the  case  where  f{x)  is  limited  to 
the  first  three  terms  in  the  expansion.  Four  constants  are  then 
involved,  requiring  the  evaluation  of  four  moments.  In  general 
there  is  little   advantage  in  considering  formulae  which  require 

*  Phil.  Tram.  E.S.  186  a,  p.  343;  Biometrika,  v,  p.  172. 
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the  evaluation  of  higher  moments,  since  the  higher  moments  are 
very  sensitive  to  errors  in  the  frequency  distribution.  We  shall 
therefore  start  from  the  differential  equation 

1  dy  _         oc-\-  a                                      ,^. 
y  dx     60  +  61^  +  6.2^- 

Pearson's  method  consists  essentially  in  making  the  moments 
calculated  from  the  curve  equal  to  the  moments  derived  from  the 
observations.     The  differential  equation  may  be  written 

{\  +  h^x  +  hox") -^  =  y  {x  +  a). 

Multiplying  each  side  by  x''\  and  integrating,  we  find 

I  x^  (bo  +  b^x  +  h^x")  -^dx=y(x+a)  x''  dx. 

di/ 
Integrating  the  L.  H.  s.  by  parts,  treating  -^  as  one  part,  we  find 

^«  (bo  +  b,x  +  b^x^)  y  -  \{nboX''-'  +  (??  +  1)  b,x''  +  (n  +  2)  b2x''+'}  ydx 

=  I  y  x^'^'^dx  +  a  I  yx^dx. 

But  since  y  becomes  zero  at  each  end  of  the  range,  we  find,  in 
terms  of  our  previous  notation, 

-  nbofin-i  -  (?i  +  1)  61  /x/  -(n  +  2)  b^f^'n+i 

=  /a'„+i  +  afMn, 

or  a/jLn   +  nbo  /Jb'n-i  +  (n  +  1)  61  /Xn   +  {)l  +  2)  b.  fl'n+i  =  -  /n+i- 

Putting $n = 0, 1, 2, 3$, in turn, in this equation, we might
obtain four equations to determine the four constants $a$, $b_0$, $b_1$, $b_2$.
These equations were derived independently of any assumption as
to the position of the origin. If the origin be at the mean, the
accents may be omitted, and $\mu_1$ may be equated to zero. The four
equations then become

\[
\begin{aligned}
n = 0:&\quad a + b_1 = 0 \\
n = 1:&\quad b_0 + 3b_2\mu_2 = -\mu_2 \\
n = 2:&\quad a\mu_2 + 3b_1\mu_2 + 4b_2\mu_3 = -\mu_3 \\
n = 3:&\quad a\mu_3 + 3b_0\mu_2 + 4b_1\mu_3 + 5b_2\mu_4 = -\mu_4
\end{aligned}
\qquad (4).
\]
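The four equations (4) are linear in $a$, $b_0$, $b_1$, $b_2$, so they can be solved mechanically once $\mu_2$, $\mu_3$, $\mu_4$ are known. The following is only a minimal sketch in Python, which is not part of the original text; the function names `solve_linear` and `pearson_constants` are illustrative assumptions, and exact rational arithmetic is used merely to avoid rounding.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve a small square linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column (fails if singular).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def pearson_constants(mu2, mu3, mu4):
    """Return [a, b0, b1, b2] from equations (4); moments about the mean."""
    matrix = [
        [1,   0,       1,       0],        # n = 0
        [0,   1,       0,       3 * mu2],  # n = 1
        [mu2, 0,       3 * mu2, 4 * mu3],  # n = 2
        [mu3, 3 * mu2, 4 * mu3, 5 * mu4],  # n = 3
    ]
    rhs = [0, -mu2, -mu3, -mu4]
    return solve_linear(matrix, rhs)


# Moments of a normal curve with unit S.D.: mu2 = 1, mu3 = 0, mu4 = 3.
print(pearson_constants(1, 0, 3))
```

For the normal moments the sketch returns $a = 0$, $b_0 = -1$, $b_1 = b_2 = 0$, so that equation (3) reduces to $\frac{1}{y}\frac{dy}{dx} = -x$, as it should.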
Sheppard's Corrections. The $\mu$'s in these equations represent
moments calculated from the curve. In practice, however, the
moments are calculated from grouped frequencies, where all values
of the characteristic within a certain interval are regarded as equal
to the value at the centre of that interval. The moments so
calculated are not precisely the same as the moments calculated
from the curve. If $\nu_1$, $\nu_2$, etc. be the moments calculated from
grouped frequencies, Sheppard* has shown that when there is high
contact at the axis, certain corrections must be applied to the
moments. The relations between the various moments are shown
by the following equations, the class-interval being taken as the unit:

\[
\begin{aligned}
\mu_2 &= \nu_2 - \tfrac{1}{12}, \\
\mu_3 &= \nu_3, \\
\mu_4 &= \nu_4 - \tfrac{1}{2}\nu_2 + \tfrac{7}{240} = \nu_4 - \tfrac{1}{2}\mu_2 - \tfrac{1}{80}
\end{aligned}
\qquad (5).
\]

Using accented $\nu'_n$ to denote the moments of the grouped
observations about other ordinates than the centroid vertical, the relations
(1) and (2) above hold for $\nu_n$, $\nu'_n$, etc. There are thus three stages
in the evaluation of $\mu_1$, $\mu_2$, etc.

(1) Evaluate $\nu'_n$ for $n = 1, 2, 3$, etc. about any convenient
ordinate.

(2) Transform to the centroid vertical by using equations
(1) or (2), so obtaining $\nu_2$, $\nu_3$, etc.; $\nu_1 = 0$.

(3) From these values deduce $\mu_2$, $\mu_3$, $\mu_4$, etc. by the use of
equations (5) above. It should be remembered that, referred to
the centroid vertical, $\mu_1 = \nu_1 = 0$.
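The three stages may be sketched in Python as follows. This is an illustration only and not part of the original text: the function names are assumptions, the transfer to the centroid uses the usual change-of-origin relations for moments (taken here to be what equations (1) and (2) express), the class-interval is the unit, and the frequencies are merely illustrative.

```python
def grouped_moments(frequencies, factors):
    """Stage 1: moments nu'_1 .. nu'_4 about a convenient working origin."""
    total = sum(frequencies)
    return [sum(f * u ** k for f, u in zip(frequencies, factors)) / total
            for k in (1, 2, 3, 4)]


def to_centroid(raw):
    """Stage 2: transfer to the centroid vertical (nu_1 becomes zero)."""
    n1, n2, n3, n4 = raw
    nu2 = n2 - n1 ** 2
    nu3 = n3 - 3 * n1 * n2 + 2 * n1 ** 3
    nu4 = n4 - 4 * n1 * n3 + 6 * n1 ** 2 * n2 - 3 * n1 ** 4
    return nu2, nu3, nu4


def sheppard(nu2, nu3, nu4):
    """Stage 3: Sheppard's corrections, class-interval taken as unit."""
    mu2 = nu2 - 1 / 12
    mu3 = nu3
    mu4 = nu4 - nu2 / 2 + 7 / 240
    return mu2, mu3, mu4


# Illustrative grouped frequencies, with class factors -4 .. 7.
freqs = [2, 7, 3, 5, 9, 4, 2, 2, 1, 2, 0, 1]
factors = list(range(-4, 8))
nu2, nu3, nu4 = to_centroid(grouped_moments(freqs, factors))
print(nu2, sheppard(nu2, nu3, nu4))
```

The corrections are small beside the raw moments unless the class-interval is coarse, which is why they are usually applied last.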

The solution of equations (4) for $a$, $b_0$, $b_1$, $b_2$ is quite
straightforward. The resulting form of equation (3) is

\[ \frac{1}{y}\frac{dy}{dx} = -\,\frac{(10\mu_2\mu_4 - 18\mu_2^3 - 12\mu_3^2)\,x + \mu_3(\mu_4 + 3\mu_2^2)}
   {\mu_2(4\mu_2\mu_4 - 3\mu_3^2) + \mu_3(\mu_4 + 3\mu_2^2)\,x + (2\mu_2\mu_4 - 3\mu_3^2 - 6\mu_2^3)\,x^2}. \]


*  W.   F.    Sheppard,    Proc.   L.M.S.    xxix,    p.    353;    see    also,    Karl   Pearson, 
Biometrika,  iii,  p.  308. 
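If in this last form we substitute

\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2}, \qquad \mu_2 = \sigma^2, \]

we obtain the equation in the form

\[ \frac{1}{y}\frac{dy}{dx} = -\,\frac{x + \dfrac{\sigma}{2}\sqrt{\beta_1}\,\dfrac{\beta_2 + 3}{5\beta_2 - 6\beta_1 - 9}}
   {\sigma^2\left[\dfrac{4\beta_2 - 3\beta_1}{10\beta_2 - 12\beta_1 - 18}
   + \dfrac{\sqrt{\beta_1}}{2}\,\dfrac{\beta_2 + 3}{5\beta_2 - 6\beta_1 - 9}\,\dfrac{x}{\sigma}
   + \dfrac{2\beta_2 - 3\beta_1 - 6}{10\beta_2 - 12\beta_1 - 18}\left(\dfrac{x}{\sigma}\right)^2\right]}. \]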

This  equation  is  referred  to  the  mean  as  origin.  The  mode  is 
obtained  by  making  the  numerator  in  the  R.  H.  s.  zero.  It  follows 
that  the  skewness  of  the  curve  is 

\[ \frac{\sqrt{\beta_1}}{2}\,\frac{\beta_2 + 3}{5\beta_2 - 6\beta_1 - 9}. \]

The curve is symmetrical when $\beta_1 = 0$.

The  form  of  the  curve  is  fixed  by  the  nature  of  the  roots  of 
the  equation 

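\[ b_0 + b_1 x + b_2 x^2 = 0, \]

i.e. by the value of $b_1^2 - 4b_0b_2$, or by $\dfrac{b_1^2}{4b_0b_2}$.

Substituting the values of $b_0$, $b_1$, $b_2$ derived above, we find

\[ \frac{b_1^2}{4b_0b_2} = \frac{\beta_1(\beta_2 + 3)^2}{4(2\beta_2 - 3\beta_1 - 6)(4\beta_2 - 3\beta_1)}. \]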

The  value  of  this  function  of  the  moments  fixes  the  nature  of 
the  curve  of  frequencies.     Hence  it  is  known  as  the  criterion. 
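As a hedged illustration (Python is not part of the original text, and the names `pearson_summary` and `kappa` are assumptions made here), the quantities $\beta_1$, $\beta_2$, the skewness and the criterion may be computed from the three moments as follows. The sign of the skewness is taken from $\mu_3$, and the criterion is set to zero when $\beta_1 = 0$, in agreement with the discussion of the symmetrical types.

```python
import math


def pearson_summary(mu2, mu3, mu4):
    """Return (beta1, beta2, skewness, criterion) from moments about the mean."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    # Skewness: sqrt(beta1)/2 * (beta2 + 3) / (5 beta2 - 6 beta1 - 9),
    # with the sign of mu3 attached (an assumption of this sketch).
    magnitude = math.sqrt(beta1) * (beta2 + 3) / (2 * (5 * beta2 - 6 * beta1 - 9))
    skewness = math.copysign(magnitude, mu3) if mu3 != 0 else 0.0
    if beta1 == 0:
        kappa = 0.0
    else:
        d = 4 * (2 * beta2 - 3 * beta1 - 6) * (4 * beta2 - 3 * beta1)
        kappa = math.inf if d == 0 else beta1 * (beta2 + 3) ** 2 / d
    return beta1, beta2, skewness, kappa


print(pearson_summary(1.0, 0.5, 4.0))
```

With the illustrative moments $\mu_2 = 1$, $\mu_3 = \tfrac12$, $\mu_4 = 4$ the criterion lies between 0 and 1.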

64.     Integration  of  the  Differential  Equation. 

Returning  to  the  general  differential  equation 

\[ \frac{1}{y}\frac{dy}{dx} = \frac{x + a}{b_0 + b_1 x + b_2 x^2}, \]

we  find  that  seven  different  types  of  curves  can  be  derived  from  it 
according  to  the  nature  of  the  constants  involved. 
I. If $b_1 = b_2 = 0$, then

\[ \frac{1}{y}\frac{dy}{dx} = \frac{x + a}{b_0}, \]

\[ \log y = \frac{x^2}{2b_0} + \frac{ax}{b_0} + \text{constant}, \]

\[ y = y_0\,e^{(x + a)^2/2b_0}, \]

where $b_0$ must be negative.

This is Gauss's Normal Error curve, with the mean at $x = -a$.
In this case the criterion is zero, $\beta_1$ being zero.

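II. If $b_2 = 0$, then

\[ \frac{1}{y}\frac{dy}{dx} = \frac{x + a}{b_0 + b_1 x}
   = \frac{1}{b_1} + \frac{1}{b_1}\left(a - \frac{b_0}{b_1}\right)\frac{b_1}{b_0 + b_1 x}, \]

and

\[ \log y = \frac{x}{b_1} + \frac{1}{b_1}\left(a - \frac{b_0}{b_1}\right)\log\,(b_0 + b_1 x) + \text{constant}
   = \frac{x}{b_1} + \frac{ab_1 - b_0}{b_1^2}\log\,(b_0 + b_1 x) + \text{constant}. \]

Removing the origin to $x = -a$, and writing

\[ l = \frac{b_0}{b_1} - a, \qquad \gamma = -\frac{1}{b_1}, \]

we obtain the equation in the form

\[ y = y_0\left(1 + \frac{x}{l}\right)^{\gamma l} e^{-\gamma x}. \]

The curve is limited in one direction, and is skew, the origin
being at the mode.

The value of the criterion in this case is $\infty$, this value being
due to the relation

\[ 2\beta_2 - 3\beta_1 - 6 = 0. \]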

III. When the roots of $b_0 + b_1 x + b_2 x^2 = 0$ are real and of
opposite sign, the equation may be written

\[ \frac{b_2}{y}\frac{dy}{dx} = \frac{x + a}{(x + c_1)(x - c_2)}
   = \frac{c_1 - a}{c_1 + c_2}\,\frac{1}{x + c_1} + \frac{c_2 + a}{c_1 + c_2}\,\frac{1}{x - c_2}, \]

\[ \therefore\quad b_2 \log y = \frac{c_1 - a}{c_1 + c_2}\log\,(x + c_1) + \frac{c_2 + a}{c_1 + c_2}\log\,(x - c_2) + \text{constant}. \]

Removing the origin to $x = -a$, and writing

\[ a_1 = c_1 - a, \qquad a_2 = c_2 + a, \qquad \nu = \frac{1}{b_2(c_1 + c_2)}, \]

this equation becomes

\[ \log y = \nu a_1 \log\,(x + a_1) + \nu a_2 \log\,(x - a_2) + \text{constant}, \]

or

\[ y = y_0\left(1 + \frac{x}{a_1}\right)^{\nu a_1}\left(1 - \frac{x}{a_2}\right)^{\nu a_2}, \]

the origin being at the mode.

The criterion is negative. The curve is limited in both
directions, and is skew.

IV. When the roots of $b_0 + b_1 x + b_2 x^2 = 0$ are real, of opposite
sign, but equal in magnitude, we should have $c_1 = c_2$ in III, and
consequently $a_1 = a_2$.

The resulting equation is

\[ y = y_0\left(1 - \frac{x^2}{a_1^2}\right)^{\nu a_1}, \]

referred to the mode as origin.

This is a symmetrical curve, limited in both directions.

The nature of the roots requires that $b_1 = 0$, and therefore $\beta_1 = 0$.

Hence the criterion is zero.

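V. When the equation $b_0 + b_1 x + b_2 x^2 = 0$ has real and equal roots,

\[ \frac{1}{y}\frac{dy}{dx} = \frac{x + a}{b_2\left(x + \dfrac{b_1}{2b_2}\right)^2}
   = \frac{1}{b_2}\,\frac{1}{x + \dfrac{b_1}{2b_2}}
   + \frac{1}{b_2}\left(a - \frac{b_1}{2b_2}\right)\frac{1}{\left(x + \dfrac{b_1}{2b_2}\right)^2}, \]

which on integrating yields

\[ \log y = -\frac{1}{b_2}\left(a - \frac{b_1}{2b_2}\right)\frac{1}{x + \dfrac{b_1}{2b_2}}
   + \frac{1}{b_2}\log\left(x + \frac{b_1}{2b_2}\right) + \text{constant}. \]

Removing the origin to $x = -\dfrac{b_1}{2b_2}$, and writing

\[ p = \frac{1}{b_2}, \qquad \gamma = \frac{1}{b_2}\left(a - \frac{b_1}{2b_2}\right), \]

we may write this equation in the form

\[ \log y = -\frac{\gamma}{x} + p \log x + \text{constant}, \]

or

\[ y = y_0\,x^{p}\,e^{-\gamma/x}. \]

This is a skew curve, with a limited range in one direction.
The criterion is unity.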

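VI. When the roots of the equation $b_0 + b_1 x + b_2 x^2 = 0$ are
real and unequal, and of the same sign.

Let the roots be $c_1$ and $c_2$. Then, as in type III,

\[ b_2 \log y = \frac{c_1 + a}{c_1 - c_2}\log\,(x - c_1) - \frac{c_2 + a}{c_1 - c_2}\log\,(x - c_2) + \text{constant}. \]

Changing the origin to $x = c_1$, and putting $c = c_2 - c_1$, we may write

\[ y = y_0\,x^{\frac{c_1 + a}{b_2(c_1 - c_2)}}\,(x - c)^{-\frac{c_2 + a}{b_2(c_1 - c_2)}}. \]

The criterion is positive and greater than unity.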

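VII. When the roots are complex, the differential equation
may be simplified slightly by transferring the origin to

\[ x = -\frac{b_1}{2b_2}, \]

and putting

\[ c = a - \frac{b_1}{2b_2}, \qquad d^2 = \frac{b_0}{b_2} - \frac{b_1^2}{4b_2^2}. \]

It then reads

\[ \frac{1}{y}\frac{dy}{dx} = \frac{x + c}{b_2(x^2 + d^2)}
   = \frac{1}{b_2}\,\frac{x}{x^2 + d^2} + \frac{c}{b_2}\,\frac{1}{x^2 + d^2}, \]

whence

\[ \log y = \frac{1}{2b_2}\log\,(x^2 + d^2) + \frac{c}{b_2 d}\tan^{-1}\frac{x}{d} + \text{constant}, \]

or

\[ y = y_0\,(x^2 + d^2)^{\frac{1}{2b_2}}\;e^{\frac{c}{b_2 d}\tan^{-1}\frac{x}{d}}. \]

This is a skew curve with unlimited range.
The criterion is positive but less than unity.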
The value of the criterion is clearly sufficient to determine
which of the seven types given above will best represent the
statistical data under consideration. When the criterion is negative
the curve is always of type III, when it is positive but less than 1,
the curve is of type VII, and when it is positive and greater
than 1, the curve is of type VI. These include the majority of
cases that can arise, but at the points where the curve changes
from one type to another, a transition curve can be used. Thus
when the criterion is $\infty$, a curve of type II can be used, and when
the criterion is unity the curve is of type V. When the criterion
is zero, on account of $\beta_1$ being zero, two types, I and IV, are possible.
The former is the normal curve of errors, for which $\mu_4 = 3\mu_2^2$ (see
page 56) and so $\beta_2 = 3$. Thus when $\beta_1 = 0$ and the criterion
vanishes, the curve is of type I if $\beta_2 = 3$, and of type IV if $\beta_2$ has
any value other than 3.
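The rule of selection just stated can be written as a short decision procedure. The sketch below is an illustration in Python, not part of the original text; the function name `classify` and the tolerance `tol` are assumptions, and the type numbers are those used in this chapter. With real data the transition cases are met only approximately, so the tolerance is a matter of judgment.

```python
def classify(beta1, beta2, tol=1e-9):
    """Return the type (numbered as in this chapter) indicated by the criterion."""
    if abs(beta1) <= tol:
        # Criterion zero: normal curve if beta2 = 3, otherwise type IV.
        return "I" if abs(beta2 - 3) <= tol else "IV"
    first = 2 * beta2 - 3 * beta1 - 6
    if abs(first) <= tol:
        return "II"          # criterion infinite
    kappa = beta1 * (beta2 + 3) ** 2 / (4 * first * (4 * beta2 - 3 * beta1))
    if kappa < 0:
        return "III"
    if abs(kappa - 1) <= tol:
        return "V"
    return "VII" if kappa < 1 else "VI"


for b1, b2 in [(0, 3), (0, 2.5), (0.5, 3), (4, 9), (0.25, 4), (2, 6.5)]:
    print(b1, b2, classify(b1, b2))
```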

For  a  detailed  application  of  the  above  types  of  curves  to  the 
adjustment  of  statistical  data,  the  reader  is  referred  to  Frequency- 
Curves  and  Correlation  by  W.  Palin  Elderton. 

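65. Use of the series

\[ y = A_0\,\phi(x) + A_3\,\phi'''(x) + A_4\,\phi^{iv}(x) + \text{etc.} \]

In this series $A_0$, $A_3$, $A_4$, etc. are constants, and $\phi(x)$ is
given by

\[ \phi(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2}, \qquad \phi'''(x) = \frac{d^3}{dx^3}\,\phi(x), \ \text{etc.}, \]

$\sigma$ being the S.D. of the distribution.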

The use of this form of frequency curve has been proposed
by Thiele*, Edgeworth†, and Charlier‡. It should be noted that
the first term of the series gives the normal error curve. The
next term introduces skewness into the curve, while the effect of
the third term is symmetrical. Charlier, in the first paper referred
to, developed this form of the error law from the hypothesis that
an error is made up of a large number of small errors, each of
which has its own error law. Edgeworth made considerable use of a
functional form which involved only the first two terms in the series.

* Theory of Observations, London, 1903.
† Camb. Phil. Trans., Vol. xx, pp. 36-65, 113-141.
‡ Arkiv för Matematik, Vol. ii, Stockholm, 1905, "Über das Fehlergesetz";
Lunds Meddelanden, 1906, "Researches into the Theory of Probability."

Charlier fits the curve to the observations by the method
of moments. With the notation of the present chapter, this
gives

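\[
\begin{aligned}
b &= \mu_1', \qquad \sigma^2 = \mu_2, \\
3!\,A_3 &= -\mu_3, \\
4!\,A_4 &= \mu_4 - 3\sigma^4, \\
5!\,A_5 &= -\mu_5 + 10\sigma^2\mu_3, \\
6!\,A_6 &= \mu_6 - 15\sigma^2\mu_4 + 30\sigma^6.
\end{aligned}
\]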

It  is  thus  not  difficult  to  fit  a  curve  to  any  given  series  of 
observations. 
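A brief sketch of such a fit, carried only as far as the term in $A_4$, is given below in Python (not part of the original text). It rests on several assumptions which should be checked against the source before use: $\phi$ is taken as the normal curve of unit area centred at the mean $b$, $A_0$ is taken as unity, the relation $4!\,A_4 = \mu_4 - 3\sigma^4$ is the usual one for this series, and the derivatives of $\phi$ are written by means of the polynomials $z^3 - 3z$ and $z^4 - 6z^2 + 3$, where $z = (x - b)/\sigma$. The function names are illustrative.

```python
import math


def charlier_coefficients(mu2, mu3, mu4):
    """A3 and A4 by the method of moments (A0 assumed to be 1)."""
    a3 = -mu3 / math.factorial(3)
    a4 = (mu4 - 3 * mu2 ** 2) / math.factorial(4)
    return a3, a4


def charlier_density(x, b, mu2, mu3, mu4):
    """Value of phi + A3 phi''' + A4 phi'''' at x."""
    sigma = math.sqrt(mu2)
    z = (x - b) / sigma
    phi = math.exp(-z * z / 2) / (sigma * math.sqrt(2 * math.pi))
    a3, a4 = charlier_coefficients(mu2, mu3, mu4)
    d3 = -(z ** 3 - 3 * z) / sigma ** 3 * phi        # third derivative of phi
    d4 = (z ** 4 - 6 * z * z + 3) / sigma ** 4 * phi  # fourth derivative of phi
    return phi + a3 * d3 + a4 * d4


# Check numerically that the fitted curve reproduces the assumed moments.
grid = [i / 100 for i in range(-1000, 1001)]
m3 = sum(x ** 3 * charlier_density(x, 0.0, 1.0, 0.4, 3.2) for x in grid) * 0.01
m4 = sum(x ** 4 * charlier_density(x, 0.0, 1.0, 0.4, 3.2) for x in grid) * 0.01
print(round(m3, 6), round(m4, 6))
```

The numerical check at the end shows that the curve so fitted has the third and fourth moments with which the calculation started, which is all that the method of moments requires.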

66. Other forms of possible frequency curves might be
suggested, and with most of these it is not difficult to fit theory
to fact. The difficulty lies rather in finding some means of
estimating the relative values of the different laws of presumptive
errors. We ask in vain for a fixed rule by which the most important
and trustworthy forms can be selected. The difficulty is scarcely
minimised by the fact that it is often possible to obtain fairly
good representations of a given set of observations by two curves
whose functional forms differ very widely. Thus Elderton, in his
excellent book on Frequency-Curves and Correlation, shows that
a certain series of observations can be almost equally well
represented by the two forms

of'  \      -4-4504  tan-' 


2/  =  2/o( 


1  + 


13-39152 


(18-39152y, 

and

\[ y = 4302\,\big[\,2.127818\,\phi(x) + .012208 \times (2.127818)^4\,\phi'''(x) + .007079 \times (2.127818)^5\,\phi^{iv}(x)\,\big], \]

while  another  series  of  observations  can  be  represented  by  the 
two  forms 

\[ y = 462.57\left(1 - \frac{x^2}{(4.543079)^2}\right)^{4.141766} \]

and

\[ y = 1244.4\,\big[\,\sigma\,\phi(x) - .0081\,\sigma^4\,\phi'''(x) - .01882\,\sigma^5\,\phi^{iv}(x)\,\big], \]

where $\sigma^2 = 1.829172$.
All that seems at all definite is that with the series

\[ y = A_0\,\phi(x) + A_3\,\phi'''(x) + A_4\,\phi^{iv}(x) + \text{etc.} \]

it is difficult to graduate a very skew distribution, or one that
rises  very  rapidly  from  the  axis.  For,  to  do  so  would  involve  the 
use  of  a  large  number  of  terms  of  the  series,  so  involving  the 
higher  moments,  whose  probable  errors  are  considerably  greater 
than  those  of  the  lower  moments.  In  such  cases  it  is  better  to 
adhere  to  Pearson's  family  of  curves.  The  straightforward  use  of 
the  criterion  will  lead  to  the  type  of  curve  which  should  give 
the  best  fit. 




CHAPTER   X 


CORRELATION 


67. In the preceding chapters we have only considered
frequency distributions due to a single variable, or to more than
one independent variable. In the present chapter we shall
consider the case where the variables are correlated. The method
may be illustrated by a simple example. The first two columns
of the table on p. 157 below give the orbital period and duration
of eclipse of 38 Algol stars*. These are represented
diagrammatically in figure 9. The horizontal scale ($x$) represents the
duration of eclipse in hours, and the vertical scale ($y$) represents
the orbital period in hours. Each dot in the diagram represents
one star. Such a diagram may be conveniently called a "scatter
diagram."

[Fig. 9. Scatter diagram: duration of eclipse in hours (horizontal
scale) against orbital period in hours (vertical scale).]

* Father Stein, Monthly Notices, R.A.S., Vol. lxix, p. 450.

The points in the scatter diagram are roughly grouped about
a straight line, showing, as might have been anticipated a priori,
that the duration of eclipse generally increases as the orbital
period increases. The total range of duration of eclipse is
subdivided into ranges 2h to 4h, 4h to 6h, etc. The mean orbital period
for all the stars in each of these intervals is calculated, and
represented in the diagram by means of a × placed at the middle of the
range, e.g. the mean orbital period of stars whose duration of
eclipse is between 10h and 12h is 79h, and this mean is represented
in the diagram by a × at (11h, 79h). Similarly the mean duration
of eclipse for stars whose orbital periods are between 10h and
40h, between 40h and 70h, etc., are represented in the diagram by
small circles.

A  straight  line  is  drawn  to  fit  the  series  of  crosses  as 
accurately  as  possible.  In  this  particular  instance  the  crosses 
do  not  lie  accurately  on  the  straight  line,  but  lie  irregularly  on 
either side of it. This straight line is called the line of regression
of $y$ on $x$. Its ordinate gives the mean value of $y$, which we may
expect to find associated with a given value of $x$. Similarly the
straight  line  which  fits  most  accurately  the  series  of  small  circles 
is  called  the  line  of  regression  of  x  on  y.  Its  abscissa  gives  the 
mean  value  of  x  corresponding  to  a  given  value  of  y.  It  should  be 
noted  that  the  two  lines  of  regression  do  not  coincide,  though  the 
angle  between  them  is  small.  When  the  means  lie  fairly  accurately 
on  a  straight  line,  the  regression  is  said  to  be  linear.  But  it  may 
happen  that  the  means  lie,  not  on  a  straight  line,  but  on  a  well- 
defined  curve.  The  curve  is  called  the  curve  of  regression,  and 
the  regression  is  then  defined  as  non-linear.  Such  a  curve  would 
be obtained if a number of observations of the volume ($v$) and the
pressure ($p$) of a given mass of gas at a constant temperature
were represented by a scatter diagram. The points in the
scatter diagram would lie closely about a rectangular hyperbola,
whose equation is $pv$ = constant. Even in cases where there is
a  clearly  defined  non-linear  curve  of  regression,  the  straight  lines 
which  fit  most  closely  the  series  of  means  are  called  the  lines  of 
regression. 
Generally speaking, it is only in dealing with isolated physical
phenomena, in which the conditions of observation can be
completely controlled, that we shall find a clearly defined functional
relation between the two variables considered. When the factors
which vary are complex and not controllable by the observer, the
curve of regression does not as a rule indicate a simple functional
relationship between the two characters considered. The
complete problem of the statistician, which is to find formulae which
will  represent  with  sufficient  accuracy  the  form  of  the  curve  of 
regression,  is  not  in  general  capable  of  solution.  But  since  the 
majority  of  the  problems  of  the  practical  statistician  relate  solely 
to  averages,  it  is  sufficient  in  many  cases  to  be  able  to  state 
whether,  on  an  average,  there  is  a  tendency  for  high  values  of 
one  of  the  characters  to  be  associated  with  high  values  (or  low 
values)  of  the  other.  If  possible  it  is  also  desirable  to  find  how 
great  a  divergence  of  one  character  from  its  mean  value  is 
associated  with  a  unit  divergence  of  the  other  from  its  mean 
value;  and  also  how  closely  this  relation  is  usually  fulfilled. 
These  questions  can  be  largely  decided  by  fitting  a  straight  line 
to  the  series  of  means  obtained  as  in  figure  9,  i.e.  by  drawing  the 
lines  of  regression. 

68.  As  an  alternative  to  the  scatter  diagram,  we  might 
represent  the  frequency  distribution  of  two  variables  by  means 
of  a  table  of  double  entry,  or  a  contingency  table.  The  following 
contingency  table  represents  the  material  in  the  scatter  diagram 
of  figure  9. 

Each  row  in  this  table  gives  the  frequency  distribution  of 
the  duration  of  eclipse  for  a  given  range  of  orbital  period,  while 
each  column  gives  the  frequency  distribution  of  the  orbital 
period  for  a  given  range  of  duration  of  eclipse.  As  the  columns 
and  rows  are  only  distinguished  by  the  accidental  circumstance 
that  one  runs  vertically  and  the  other  horizontally,  the  word 
"array"  is  used  to  denote  either  a  row  or  a  column. 

The  choice  of  class-intervals  is  to  a  large  extent  arbitrary, 
and  in  general  any  interval  which  happens  to  be  convenient  may 
be  selected.  When  this  choice  has  been  made,  the  contingency 
table  can  be  completed  in  a  number  of  ways.  If  a  scatter 
diagram such as that of figure 9 has been made, the class-intervals
may be ruled in the diagram, and the number of dots
in each compartment may then be counted. If such a scatter
diagram is not available, a form such as that shown below should
be ruled on a large sheet of paper, with the class-intervals headed
as in the final table. Each observation can be represented in this
table by a cross in the corresponding compartment. The sum
of the crosses in a compartment is then entered in the
corresponding compartment of the final table. It should be noted
that any compartment may contain halves and even quarters of
a frequency. For if an observation falls exactly on the dividing-line
between two compartments, it is counted as a half in each
of those compartments; and if an observation falls exactly at the
common angular point of four compartments, it is counted as
a quarter in each of the four compartments.

                    Duration of eclipse in hours.
Orbital
period
(hours)    2—   4—   6—   8—  10—  12—  14—  16—  18—  20—  22—  24—  Total

  10—       2    6    .    .    .    .    .    .    .    .    .    .      8
  40—       .    1    3    2    3    2    .    .    .    .    .    .     11
  70—       .    .    .    3    4    2    .    .    .    .    .    .      9
 100—       .    .    .    .    2    .    1    1    .    .    .    .      4
 130—       .    .    .    .    .    .    1    .    1    .    .    .      2
 160—       .    .    .    .    .    .    .    .    .    .    .    1      1
 190—       .    .    .    .    .    .    .    .    .    1    .    .      1
 220—       .    .    .    .    .    .    .    1    .    1    .    .      2

 Total      2    7    3    5    9    4    2    2    1    2    —    1     38
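The counting rule, with its halves and quarters, may be sketched as follows. This is an illustration in Python and not part of the original text; the function names and the small set of observations are hypothetical, and exact fractions are used so that a value lying on a dividing-line is recognised without rounding trouble.

```python
from fractions import Fraction


def class_weights(value, lower, width, n_classes):
    """Share of one observation falling in each class along one variable."""
    pos = Fraction(str(value)) - Fraction(str(lower))
    index, rem = divmod(pos, Fraction(str(width)))
    if rem == 0 and 0 < index < n_classes:
        # Exactly on an interior dividing-line: half to each neighbour.
        return {index - 1: Fraction(1, 2), index: Fraction(1, 2)}
    if rem == 0 and index == n_classes:
        return {n_classes - 1: Fraction(1)}   # upper end of the last class
    if 0 <= index < n_classes:
        return {index: Fraction(1)}
    raise ValueError("observation outside the range of the table")


def contingency_table(pairs, x_lower, x_width, nx, y_lower, y_width, ny):
    """Rows are y-classes, columns are x-classes."""
    table = [[Fraction(0)] * nx for _ in range(ny)]
    for x, y in pairs:
        for j, wx in class_weights(x, x_lower, x_width, nx).items():
            for i, wy in class_weights(y, y_lower, y_width, ny).items():
                table[i][j] += wx * wy
    return table


# Hypothetical observations: the second lies on a dividing-line of x,
# the third at the common angular point of four compartments.
pairs = [(3.0, 20.0), (4.0, 20.0), (4.0, 40.0)]
table = contingency_table(pairs, 2, 2, 3, 10, 30, 2)
print(table)
```

Whatever the splitting, the entries of the table always add up to the number of observations.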

When there is a functional relation between the two
characters considered, as will often happen in the problems of the
physicist or chemist, there will be entries in only a few
compartments of the contingency table. The way in which the
entries  are  grouped  will  afford  some  idea  of  the  relationship 
between  the  two  variable  characters.  Thus  in  the  contingency 
table  above,  the  entries  run  diagonally  across  the  table,  showing 
distinct  correlation  between  the  two  characters  considered. 
When the means represented by crosses (or circles) in the
scatter diagram (see figure 9) lie accurately on a straight line,
the two variables considered are connected by a linear relation,
and are said to be completely correlated. In practice, however,
this seldom occurs, and the means lie irregularly about a straight
line. The straight line which best fits the series of means might
be drawn by a simple graphical method, say by means of a
stretched thread moved about until as many of the means lie on
one side as on the other. But such a method would generally
allow of our drawing a number of straight lines, any one of which
would apparently be as good a fit as any other. It is therefore
necessary to adopt some standard method of drawing what shall
be regarded as the best-fitting straight line. The method
commonly adopted is based on the theory of least squares, but
it should be remembered that this step is arbitrary, and that
other methods might be suggested which would yield equally
good results. The method of moments yields precisely the same
results as the least squares method*.

* Elderton, Frequency-Curves and Correlation, p. 114.

69. The total range of $x$-variation is divided into a convenient
number of intervals. Let there be $n_x$ observations,
i.e. $n_x$ $y$'s, in any one of these intervals, and let the mean of
these $y$'s be represented by the ordinate of the point P in
figure 10. A number of points such as P will be represented
in the completed figure, each point corresponding to the mean
of all the $y$'s in one $x$-interval. The curve of regression passes
through all these points. We now have to draw the straight
line which shall best fit all these points, or, in other words, we
have to find the equation of the line of regression of $y$ on $x$.

[Fig. 10.]

Referred to axes through the means of $x$ and $y$, let the
equation of the line of regression be

\[ Y = a + bX. \]
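Let the ordinate of P meet the line of regression at the point
M. Then the usual least squares method of adjustment is to
make $\Sigma\,n_x\,PM^2$ a minimum. Let the ordinate of P be $y_x$, and
of M, $Y$. Then we have to make

\[ V = \Sigma\,n_x (Y - y_x)^2 \]

a minimum, where the summation extends to all the $x$-intervals.
Since $Y = a + bX$ for the point M,

\[ V = \Sigma\,n_x (a + bX - y_x)^2
     = \Sigma\,n_x \{a^2 + b^2X^2 + 2abX + y_x^2 - 2ay_x - 2bXy_x\}. \]

But since the origin is at the mean of $x$ and $y$,

\[ \Sigma\,n_x y_x = 0, \qquad \Sigma\,n_x X = 0, \]

and therefore

\[ V = \Sigma\,n_x \{a^2 + b^2X^2 + y_x^2 - 2bXy_x\}. \]

This last form of $V$ clearly demands $a = 0$ for its minimum
value. Differentiating with respect to $b$,

\[ 2b\,\Sigma\,n_x X^2 - 2\,\Sigma\,n_x X y_x = 0, \]

or

\[ b = \frac{\Sigma\,n_x X y_x}{\Sigma\,n_x X^2} = \frac{\Sigma xy}{\Sigma x^2}, \]

where, in the last fractional form, the summation extends to all
pairs of associated values of $x$ and $y$. If $\sigma_1$, $\sigma_2$ be the standard
deviations of $x$ and $y$ respectively,

\[ \Sigma x^2 = N\sigma_1^2, \qquad \Sigma y^2 = N\sigma_2^2. \]

Let

\[ \Sigma xy = N r \sigma_1 \sigma_2. \]

Then $b = r\,\dfrac{\sigma_2}{\sigma_1}$, and the equation of the line of regression of
$y$ on $x$ is

\[ y = r\,\frac{\sigma_2}{\sigma_1}\,x, \qquad \text{or} \qquad \frac{y}{\sigma_2} = r\,\frac{x}{\sigma_1}. \]

Similarly the line of regression of $x$ on $y$ may be shown to
have the equation

\[ x = r\,\frac{\sigma_1}{\sigma_2}\,y, \qquad \text{or} \qquad \frac{x}{\sigma_1} = r\,\frac{y}{\sigma_2}. \]

If the measures $x$, $y$ be referred to some other zero than the
mean, and $\bar{x}$, $\bar{y}$ be their mean values, the equations of the lines of
regression given above are changed into

\[ y - \bar{y} = r\,\frac{\sigma_2}{\sigma_1}\,(x - \bar{x}), \]

\[ x - \bar{x} = r\,\frac{\sigma_1}{\sigma_2}\,(y - \bar{y}). \]

In these equations $r$ is a quantity defined by the equation

\[ r = \frac{\Sigma xy}{N\sigma_1\sigma_2}. \]

This quantity $r$ is called the coefficient of correlation.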
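The formulae of this article may be sketched in Python as follows (an illustration only; Python, the function name and the four pairs of values are assumptions, not part of the original text). The deviations are measured from the means, and the S.D.'s are computed with divisor $N$, as in the text.

```python
import math


def correlation_and_regression(xs, ys):
    """Coefficient of correlation and the slopes of the two lines of regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sigma1 = math.sqrt(sum_x2 / n)
    sigma2 = math.sqrt(sum_y2 / n)
    r = sum_xy / (n * sigma1 * sigma2)
    return {
        "r": r,
        "slope_y_on_x": r * sigma2 / sigma1,   # equals sum_xy / sum_x2
        "slope_x_on_y": r * sigma1 / sigma2,   # equals sum_xy / sum_y2
        "mean_x": mean_x,
        "mean_y": mean_y,
    }


# Hypothetical pairs of associated values.
result = correlation_and_regression([1, 2, 3, 4], [2, 4, 5, 9])
print(result)
```

The product of the two slopes is $r^2$, which affords a convenient check on the arithmetic.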

70. Returning to the function $V$ defined above, we may
now write

\[ V = \Sigma\,n_x (bX - y_x)^2, \]

where the summation extends to all the $x$-arrays,
or

\[ V = \Sigma\,(bx - y)^2 = \Sigma\left(r\,\frac{\sigma_2}{\sigma_1}\,x - y\right)^2, \]

where the summation extends to all the pairs of associated
deviations.

Hence

\[ V = r^2\,\frac{\sigma_2^2}{\sigma_1^2}\,\Sigma x^2 - 2r\,\frac{\sigma_2}{\sigma_1}\,\Sigma xy + \Sigma y^2
     = N\sigma_2^2\,(1 - r^2). \]

If $s_2$ be the M.S.E. due to taking the value of $y$ given by the
line of regression

\[ y = \frac{r\sigma_2}{\sigma_1}\,x \]

instead of the measured deviation of $y$, then

\[ V = N s_2^2 = N\sigma_2^2\,(1 - r^2), \]

and

\[ s_2 = \sigma_2\,(1 - r^2)^{\frac{1}{2}}. \]

Similarly if $s_1$ be the M.S.E. of $x$ derived from the equation

\[ x = \frac{r\sigma_1}{\sigma_2}\,y, \]

then

\[ s_1 = \sigma_1\,(1 - r^2)^{\frac{1}{2}}. \]

If $r = 1$, then $V = 0$, and $s_1 = s_2 = 0$. And since

\[ V = \Sigma\,(bx - y)^2, \]

it follows that when $r = 1$,

\[ bx - y = 0 \]

for each pair of associated deviations $x$, $y$. In other words, a
linear relation

\[ y = bx \]

is then rigorously satisfied by all pairs of values of $x$, $y$.
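The identity $V = N\sigma_2^2(1 - r^2)$ is easily verified numerically. The sketch below is an illustration in Python with hypothetical values (neither the function name nor the data belong to the original text); it forms $V$ directly from the residuals $bx - y$ and compares it with the expression in $r$.

```python
import math


def regression_residuals(xs, ys):
    """Return (V, N * sigma2^2 * (1 - r^2), s2) for the regression of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_x2 = sum(u * u for u in dx)
    sum_y2 = sum(v * v for v in dy)
    sum_xy = sum(u * v for u, v in zip(dx, dy))
    b = sum_xy / sum_x2                       # slope of y on x
    r2 = sum_xy ** 2 / (sum_x2 * sum_y2)      # square of the correlation
    v_direct = sum((b * u - v) ** 2 for u, v in zip(dx, dy))
    v_formula = sum_y2 * (1 - r2)             # N * sigma2^2 * (1 - r^2)
    return v_direct, v_formula, math.sqrt(v_direct / n)


v_direct, v_formula, s2 = regression_residuals([1, 2, 3, 4], [2, 4, 5, 9])
print(v_direct, v_formula, s2)
```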

If the coefficient $r$ only differs slightly from unity, the points
in the scatter diagram are closely grouped about a straight
line. The two lines of regression (which coincide when $r = 1$) are
then inclined to one another at a small angle.

If $r$ be small the angle between the lines of regression is large,
and when $r = 0$, the lines are at right angles, their equations being
$y = 0$, $x = 0$. In the case where $r$ is small, the M.S.E. of $y$ caused
by our adopting the linear relation $y = \dfrac{r\sigma_2}{\sigma_1}\,x$ between the two
variables instead of using the original observations, defined above
as $s_2$, is nearly as great as $\sigma_2$. Or in other words, if we want to
find the value of $y$ corresponding to a given value of $x$, the value
$\dfrac{r\sigma_2}{\sigma_1}\,x$ is only a very slight improvement on the mean value of all
the $y$'s in the case where $r$ is small.

When $r$ is small, there is only very slight correlation between
the two variable characters considered, and it seems doubtful
whether any serious meaning can be attached to values of $r$ which
are less than .5.

When $r = 0$, there is apparently no correlation between $x$ and $y$,
and the lines of regression are at right angles to one another.
In the next diagram (fig. 11), however, is shown an extreme case
where $r = 0$, while the variables are connected by a clearly marked
relation. It is therefore not safe to assume a complete absence of
correlation in cases where the coefficient $r$ is very small. The
evaluation  of  the  correlation  ratio,  which  will  be  discussed  later, 
affords  a  better  test  of  correlation  in  such  cases. 


Fig. 11. Curve of regression of $y$ on $x$, together with the
corresponding line of regression.

71.     Evaluation  of  r  for  the  material  of  figure  9. 

In  the  following  table  we  shall  evaluate  r  for  the  material 
which  has  been  represented  diagrammatically  in  figure  9.  The 
first  column  gives  the  orbital  period  in  hours,  and  the  second 
column  the  duration  of  eclipse  in  hours,  for  38  stars.  The  means 
of  these  columns  are  evaluated.  In  the  third  and  fourth  columns 
are  given  the  deviations  from  these  means  of  the  orbital  period 
and duration of eclipse of each star. These are the quantities
which we have called $y$ and $x$, respectively, in the preceding
discussion. The remainder of the table is self-explanatory.

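\[ r\sigma_1\sigma_2 = 257.54; \]

\[ r = \frac{257.54}{55.6 \times 5.26} = .881, \]

\[ \frac{r\sigma_2}{\sigma_1} = \frac{.881 \times 55.6}{5.26} = 9.31, \]

\[ \frac{r\sigma_1}{\sigma_2} = \frac{.881 \times 5.26}{55.6} = .0833. \]

The corresponding lines of regression are

\[ Y - 82.7 = 9.31\,(X - 10.4), \qquad \text{or} \qquad Y = 9.31X - 14.1 \qquad (A), \]

and

\[ X - 10.4 = .0833\,(Y - 82.7), \qquad \text{or} \qquad X = .0833Y + 3.5 \qquad (B). \]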


 Orbital  Duration
  period  of eclipse        y        x       y²       x²        xy

    13.2       2.7      -69.5     -7.7     4830     59.3    +535.2
    15.1       3.5      -67.6     -6.9     4570     47.6    +466.4
    20.1       5.1      -62.6     -5.3     3919     28.1    +341.8
    21.4       5.0      -61.3     -5.4     3758     29.2    +331.0
    21.6       5.0      -61.1     -5.4     3733     29.2    +329.9
    27.3       5.0      -55.4     -5.4     3069     29.2    +299.2
    28.7       5.5      -54.0     -4.9     2916     24.0    +264.6
    32.6       4.7      -50.1     -5.7     2510     32.5    +285.6
    45.5       5.0      -37.2     -5.4     1384     29.2    +200.9
    45.8       6.5      -36.9     -3.9     1362     15.2    +143.9
    47.4       6.5      -35.3     -3.9     1246     15.2    +137.7
    55.8      12.0      -26.9      1.6      724      2.6     -43.0
    58.0      10.4      -24.7      0        610      0         0
    59.8      10.0      -22.9     -0.4      524      0.2      +9.2
    65.4       8.0      -17.3     -2.4      299      5.8     +41.5
    66.4       8.0      -16.3     -2.4      266      5.8     +39.1
    66.4       7.9      -16.3     -2.5      266      6.3     +40.8
    68.8      12.0      -13.9      1.6      193      2.6     -22.2
    68.8      10.0      -13.9     -0.4      193      0.2      +5.6
    71.9       8.0      -10.8     -2.4      117      5.8     +25.9
    73.3      11.1       -9.4      0.7       88      0.5      -6.6
    79.3      12.0       -3.4      1.6       12      2.6      -5.4
    79.6      11.8       -3.1      1.4       10      2.0      -5.3
    81.1      13.1       -1.6      2.7        3      7.3      -4.3
    82.8      10.5        0.1      0.1        0      0         0
    82.8       9.7        0.1     -0.7        0      0.5      -0.1
    94.9      10.0       12.2     -0.4      149      0.2      -4.9
    95.8       9.2       13.1     -1.2      172      1.5     -15.7
   106.2      10.3       23.5     -0.1      552      0        -2.4
   109.5      11.8       26.8      1.4      718      2.0     +37.5
   110.4      14.0       27.7      3.6      767     13.0     +99.7
   115.3      17.2       32.6      6.8     1063     46.2    +221.7
   142.4      15.2       59.7      4.8     3564     23.0    +286.6
   144.1      19.0       61.4      8.6     3770     74.0    +528.0
   164.7      26.0       82.0     15.6     6724    243.4   +1279.2
   202.3      20.0      119.6      9.6    14303     92.2   +1148.2
   227.6      21.5      144.9     11.1    20996    123.2   +1608.4
   250.3      17.5      167.6      7.1    28090     50.4   +1190.0

Sums:
  3142.4     400.7                       117470   1050.0    9787.7

Means:
    82.7      10.4                       3091.3    27.63    257.54

$\sigma_2 = 55.6$, $\sigma_1 = 5.26$.

The  two  straight   lines  are  represented  in  figure  9   by  the 
continuous  and  dotted  lines  respectively. 

The mean square error in $y$ due to assuming the relation A is

\[ 55.6\,\sqrt{1 - (.881)^2}\ \text{hours} = 26.3\ \text{hours}, \]

and the mean square error in $x$ due to assuming the relation B is

\[ 5.26\,\sqrt{1 - (.881)^2}\ \text{hours} = 2.5\ \text{hours}. \]
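The arithmetic of this article can be retraced from the sums at the foot of the table. The following is a minimal sketch in Python (not part of the original text; the function name is an assumption), taking $N = 38$, $\Sigma x^2 = 1050.0$, $\Sigma y^2 = 117470$ and $\Sigma xy = 9787.7$ as given.

```python
import math


def from_sums(n, sum_x2, sum_y2, sum_xy):
    """S.D.'s, correlation, regression slopes and mean square errors from the sums."""
    sigma1 = math.sqrt(sum_x2 / n)
    sigma2 = math.sqrt(sum_y2 / n)
    r = sum_xy / (n * sigma1 * sigma2)
    root = math.sqrt(1 - r * r)
    return {
        "sigma1": sigma1,
        "sigma2": sigma2,
        "r": r,
        "slope_y_on_x": sum_xy / sum_x2,   # r * sigma2 / sigma1
        "slope_x_on_y": sum_xy / sum_y2,   # r * sigma1 / sigma2
        "s1": sigma1 * root,
        "s2": sigma2 * root,
    }


values = from_sums(38, 1050.0, 117470.0, 9787.7)
for name, value in values.items():
    print(name, round(value, 4))
```

Working from the sums rather than from the rounded values of $r$, $\sigma_1$ and $\sigma_2$ avoids the accumulation of rounding errors in the slopes.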


72.     Calculation  of  r  from  the  Contingency  Table. 

When  the  number  of  observations  is  large,  the  method  of 
calculating  r  used  in  the  preceding  table  becomes  extremely 
laborious.  It  is  then  better  to  calculate  r  from  the  contingency 
table.  To  illustrate  the  method,  we  shall  evaluate  r  from  the 
contingency  table  of  page  151. 


Duration  of 

eclipse. 

2— 

4— 

6— 

8— 

10— 

12— 

14— 

16— 

18— 

20— 

22— 

24— 

Totals 

10— 

s^ 

6^ 

8 

40— 

31 

,  3 

2 

1 

„3 

.2 

11 

o 

70— 

o3 

0* 

2 

0 

9 

100— 

o2 

2II3I 

4 

1 

o 

130— 

^^ 

si 

2 

160— 

21 1 

1 

190— 

! 

»i 

1 

220— 

15  1 

.1 

2 

Totals 

2 

i 

3 

5 

9 

4 

2 

2 

1 

2 

— 

1 

38 

The above table is a repetition of the table of page 151, with
the arrays which contain the means of the two variables marked
by thick lines. The midpoints of these two arrays are taken as
zeros for the coordinates, and the class-intervals are taken as units
in terms of which the S.D.'s of $x$ and $y$ are to be expressed. The
following tables give the calculation.

Each of the tables (a) and (b) gives the calculation of the
mean and of the S.D. for one of the variables. For still greater
accuracy, we might apply Sheppard's corrections to the values of
$\sigma_1$ and $\sigma_2$.
(a)     For x.

  Sum    Factor    1st product    2nd product

    2      -4           -8             32
    7      -3          -21             63
    3      -2           -6             12
    5      -1           -5              5
    9       0            0              0
    4       1            4              4
    2       2            4              8
    2       3            6             18
    1       4            4             16
    2       5           10             50
    0       6            0              0
    1       7            7             49

   38                   -5            257

  Means ...           -.13          6.763

$\sigma_1^2 = 6.763 - (.13)^2 = 6.746. \qquad \sigma_1 = 2.6.$

(b)     For y.

  Sum    Factor    1st product    2nd product

    8      -2          -16             32
   11      -1          -11             11
    9       0            0              0
    4       1            4              4
    2       2            4              8
    1       3            3              9
    1       4            4             16
    2       5           10             50

   38                   -2            130

  Means ...           -.05           3.43

$\sigma_2^2 = 3.43 - (.05)^2 = 3.427. \qquad \sigma_2 = 1.85.$

These tables yield almost exactly the same values of $\sigma_1$ and $\sigma_2$
as were derived from the ungrouped material of observation.

Thus $\sigma_1 = 2.6$ units $= 5.2$ hours,

$\sigma_2 = 1.85$ units $= 1.85 \times 30$ hours $= 55.5$ hours.
In order to complete the calculation of $r$, we must calculate
the product-moment $\Sigma xy$. The factors corresponding to $xy$ are
the small figures given in the lower left-hand corners of the
compartments of the table. In this particular example the sum
of all the moments from any one quadrant can easily be evaluated.
The result may be represented thus:


-1-63 


Total  159. 


The product-moment thus obtained is not about the means of
the two variables, and must be corrected.

$$r\sigma_1\sigma_2 + .13 \times .05 = \frac{159}{38} = 4.19,$$

$$r\sigma_1\sigma_2 = 4.183,$$

$$r = \frac{4.183}{2.6 \times 1.85} = .870.$$

Thus the value of $r$ yielded by this method is .870 as compared
with .881 obtained by working with the original ungrouped
material.
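The grouped calculation of $r$ can be sketched in a few lines of Python. The frequencies and factors below are copied from tables (a) and (b), and the raw product-moment total 159 is taken from the text; the names `x_rows`, `y_rows` and `mean_and_sd` are assumptions introduced for illustration only.

```python
import math

# (frequency, factor) pairs, in class-interval units, from tables (a) and (b)
x_rows = [(2, -4), (7, -3), (3, -2), (5, -1), (9, 0), (4, 1),
          (2, 2), (2, 3), (1, 4), (2, 5), (0, 6), (1, 7)]
y_rows = [(8, -2), (11, -1), (9, 0), (4, 1), (2, 2), (1, 3), (1, 4), (2, 5)]
sum_xy = 159  # product-moment about the assumed zeros, as quoted in the text


def mean_and_sd(rows):
    """Return (N, mean, S.D.) from grouped frequencies and integer factors."""
    n = sum(f for f, _ in rows)
    first = sum(f * k for f, k in rows) / n        # mean of 1st products
    second = sum(f * k * k for f, k in rows) / n   # mean of 2nd products
    return n, first, math.sqrt(second - first * first)


n, mean_x, sd_x = mean_and_sd(x_rows)
_, mean_y, sd_y = mean_and_sd(y_rows)

# Correct the product-moment to the true means, then divide by the S.D.'s.
r = (sum_xy / n - mean_x * mean_y) / (sd_x * sd_y)
print(n, round(sd_x, 2), round(sd_y, 2), round(r, 3))
```

Carrying more decimals than the hand computation changes the result only slightly.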

When the number of observations represented in the contingency
table is large, the product-moment cannot be obtained
quite as simply as was done above. It may then be advisable
to write in one column all the factors which are effective,
writing the corresponding frequencies, with due regard to sign,
in a second column. The remainder of the calculation is then
easy.

73.     The  Correlation  Ratio. 

If the curve of regression be not linear, $r$ cannot be regarded
as a satisfactory measure of the amount of correlation between
the two characters considered. We then have to find a method
which will decide whether the observations are clustered closely
about the curve of regression, or are widely scattered. The
obvious method is to evaluate the S.D. of each array.

Let $s_{ax}$ be the S.D. of any $x$-array. Then once the quantities
$s_{ax}$ have been evaluated for all the arrays, we can tell accurately
how closely the individual points in the scatter diagram are
clustered about the curve of regression. A curve representing
$s_{ax}/\sigma_2$ is called a scedastic curve. The mean ordinate of such a curve
is a measure of the closeness with which a given value of $x$ yields
a definite value of $y$. In practice, however, it is more convenient
to evaluate the S.D. of the weighted S.D.'s of the separate $x$-arrays.
Let this S.D. be $\sigma_{ax}$.

Then $$\sigma_{ax}^2 = \frac{\Sigma\,(n\,s_{ax}^2)}{N}.$$

It is found that $\sigma_{ax}$ can be calculated without going through
the process of evaluating $s_{ax}$ for each $x$-array.

Let $M_y$ denote the mean of all $y$'s, $m_y$ the mean of any $x$-array*
whose S.D. is $s_{ax}$. Summing for one array, we find

$$\Sigma (y - M_y)^2 = \Sigma (y - m_y)^2 + n (M_y - m_y)^2 = n\,s_{ax}^2 + n (M_y - m_y)^2.$$

Summing for all arrays, we find

$$N\sigma_2^2 = N\sigma_{ax}^2 + N\sigma_{my}^2,$$

where $\sigma_{my}$ is the S.D. of the weighted means of the separate
$x$-arrays.

$$\therefore\ \sigma_{ax}^2 = \sigma_2^2 - \sigma_{my}^2 = \sigma_2^2 (1 - \eta_{yx}^2) \text{ say,}$$

where $$\eta_{yx} = \frac{\sigma_{my}}{\sigma_2}.$$

The quantity $\eta_{yx}$ thus defined is called the correlation ratio of $y$
on $x$. There will naturally be a second correlation ratio, $\eta_{xy}$, which
is the correlation ratio of $x$ on $y$. Its value is $\sigma_{mx}/\sigma_1$.

The mean square of the distance between the curve of
regression and the line of regression $y = r\dfrac{\sigma_2}{\sigma_1}x$ is

$$\frac{1}{N}\Sigma\, n\left(m_y - r\frac{\sigma_2}{\sigma_1}x\right)^2 = \frac{\Sigma\, n\,m_y^2}{N} + r^2\frac{\sigma_2^2}{\sigma_1^2}\frac{\Sigma\, n\,x^2}{N} - 2r\frac{\sigma_2}{\sigma_1}\frac{\Sigma\, n\,x\,m_y}{N}$$

$$= \sigma_{my}^2 + r^2\frac{\sigma_2^2}{\sigma_1^2}\sigma_1^2 - 2r\frac{\sigma_2}{\sigma_1}\,r\sigma_1\sigma_2$$

$$= \sigma_2^2\,(\eta_{yx}^2 - r^2).$$

* An $x$-array is an array of $y$'s.
Thus $\eta^2 - r^2$ is a measure of the deviation from linearity of
the curve of regression. In accurate work it is advisable to
calculate $\eta$ as well as $r$, since $\eta$ is a better measure of causal
relation than $r$, and $\eta^2 - r^2$ affords a measure of the linearity
of the regression. It should be noted that $\eta$ is always greater
than $r$, except when the regression is accurately linear, and in
this case $\eta = r$. And conversely, if $\eta = r$, within the limits of
random sampling, the regression is linear.

It is clear from the equation

$$1 - \eta_{yx}^2 = \frac{\sigma_{ax}^2}{\sigma_2^2}$$

that $|\eta|$ must lie between 0 and 1, and since $\eta$ is the ratio of
two S.D.'s it must be positive. The correlation ratio therefore
lies between 0 and 1 in all cases. When the correlation is
complete $\eta$ is unity, and when there is no correlation between
the variables, $\eta$ is zero. When there is a considerable amount
of correlation between the variables, $\eta$ is large, and when the
variables are only slightly correlated $\eta$ is small. But the correlation
ratio only affords a satisfactory test when the number of
observations is sufficiently great to permit of the formation of
a grouped contingency table.
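The decomposition of the total variance into a part within the arrays and a part due to the array means, on which the correlation ratio rests, may be illustrated numerically. The sketch below is in Python and uses a small set of hypothetical arrays invented purely for illustration; none of the figures come from the book.

```python
import math

# Hypothetical data: key = x-value of the array, list = the y's in that x-array.
arrays = {
    1: [2.0, 3.0, 4.0],
    2: [5.0, 7.0],
    3: [6.0, 6.5, 7.5, 8.0],
    4: [5.0, 6.0, 4.0],
}

pairs = [(x, y) for x, ys in arrays.items() for y in ys]
N = len(pairs)
M_x = sum(x for x, _ in pairs) / N
M_y = sum(y for _, y in pairs) / N
var_x = sum((x - M_x) ** 2 for x, _ in pairs) / N
var_total = sum((y - M_y) ** 2 for _, y in pairs) / N      # sigma_2 squared

var_within = 0.0   # sigma_ax squared: weighted mean of the array variances
var_means = 0.0    # sigma_my squared: variance of the weighted array means
for ys in arrays.values():
    n = len(ys)
    m_y = sum(ys) / n
    var_within += sum((y - m_y) ** 2 for y in ys) / N
    var_means += n * (m_y - M_y) ** 2 / N

eta_yx = math.sqrt(var_means / var_total)
r = sum((x - M_x) * (y - M_y) for x, y in pairs) / N / math.sqrt(var_x * var_total)
print(round(eta_yx, 3), round(r, 3))
```

For any data the identity `var_total = var_within + var_means` holds, and `eta_yx` is never less than the absolute value of `r`.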

74.     Probable  Errors. 

As all the quantities $\sigma$, $r$, $\eta$, etc., mentioned above are in
general calculated from a sample of the total population which
bears the characters considered, it is necessary to consider the
probable errors of all these quantities. Remembering that


/n  -  1 

we  may  write 

P.E. of mean $= .6745\,\dfrac{\sigma}{\sqrt{n-1}} = .6745\,\dfrac{\sigma}{\sqrt{n}}$ (approximately),

P.E. of $\sigma = .6745\,\dfrac{\sigma}{\sqrt{2(n-1)}} = .6745\,\dfrac{\sigma}{\sqrt{2n}}$ (approximately).

The P.E. of $r$ may be taken to be

$$.6745\,\frac{1-r^2}{\sqrt{n}}$$

in cases where the frequency distribution is approximately normal,
and $n$ is large.

When the regression is linear, or nearly linear, the P.E. of $\eta$ is

$$.6745\,\frac{1-\eta^2}{\sqrt{n}}.$$


These formulae are based upon certain definite assumptions
concerning the relations between the variables considered. It can
be shown that if the variations of $x$ and $y$ from their mean
values are due to a number of independent sources of error,
some of which are common to both, while each of the elementary
errors follows a normal law of distribution, then the number of
cases in which $x$ lies between $x$ and $x + dx$, while $y$ lies between
$y$ and $y + dy$, will be $f(x, y)\,dx\,dy$, where

$$f(x,y) = A\,e^{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_1^2} + \frac{y^2}{\sigma_2^2} - \frac{2rxy}{\sigma_1\sigma_2}\right)}.$$

It will be noted that this law of distribution requires that
both $x$ and $y$, when considered independently, should follow a
normal law of distribution.

The P.E.'s of $r$ and $\eta$ given above are calculated by the use of
the formula here given for $f(x, y)$. The methods of correlation
are frequently applied to variables which do not follow a normal
law of variation, and the probable errors of $r$ and $\eta$ are still
calculated by the use of the formulae given above. It must
be remembered that in such cases the formulae cannot yield
more than a rough approximation to the values of the probable
errors.
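As a rough illustration of the size of these probable errors, the sketch below evaluates the approximate large-sample forms, $.6745\,\sigma/\sqrt{n}$ for a mean, $.6745\,\sigma/\sqrt{2n}$ for an S.D., and the customary $.6745\,(1-r^2)/\sqrt{n}$ for $r$, using the values $r = .881$, $\sigma_2 = 55.6$ and $n = 38$ of the eclipsing-star example. The function names are assumptions made for the illustration, and the formulae are subject to the caution about normality stated above.

```python
import math

PE_FACTOR = 0.6745  # ratio of probable error to standard error under the normal law


def pe_mean(sigma, n):
    """Approximate probable error of a mean."""
    return PE_FACTOR * sigma / math.sqrt(n)


def pe_sigma(sigma, n):
    """Approximate probable error of a standard deviation."""
    return PE_FACTOR * sigma / math.sqrt(2 * n)


def pe_r(r, n):
    """Customary probable error of a correlation coefficient (normal case, large n)."""
    return PE_FACTOR * (1 - r * r) / math.sqrt(n)


n = 38
print(round(pe_mean(55.6, n), 2), round(pe_sigma(55.6, n), 2), round(pe_r(0.881, n), 4))
```

With only 38 observations these figures can serve as no more than a guide.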

75.  To calculate $\eta$ for the material of the table on
page 151.

The following is a rough estimate of the value of $\eta$. It is
necessary to evaluate $m_y$ for each $x$-array. For the purpose of
a rough calculation, it is sufficient to assume the mean value
of $y$ for the observations in any one compartment to be the
central value for that compartment. The value of $m_y$ is then
easily evaluated. The mean value of all the $y$'s has already been
shown to be 82.7. This is the quantity called $M_y$.

    x       n      m_y     m_y - M_y    (m_y - M_y)²    n (m_y - M_y)²

  2- 4      2     25.0      -57.7          3329             6658
  4- 6      7     29.3      -53.4          2851            19957
  6- 8      3     55.0      -27.7           765             2295
  8-10      5     73.0       -9.7            94              470
 10-12      9     81.7       -1.0             1               19
 12-14      4     70.0      -12.7           161              644
 14-16      2    130.0       47.3          2237             4474
 16-18      2    175.0       92.3          8519            17038
 18-20      1    145.0       62.3          3581             3581
 20-22      2    220.0      137.3         18851            37702
 22-24      -       -          -             -                -
 24-26      1    175.0       92.3          8519             8519

           38                                              101357

$$\sigma_{my}^2 = \frac{101357}{38} = 2667, \qquad \sigma_{my} = 51.6,$$

$$\eta_{yx} = \frac{\sigma_{my}}{\sigma_2} = \frac{51.6}{55.6} = .928.$$

The value of $r$ was found to be .881.
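The rough estimate of $\eta$ can be repeated directly from the array sizes $n$ and array means $m_y$ of the table. The following Python sketch assumes $M_y = 82.7$ and $\sigma_2 = 55.6$ as given in the text; the variable names are illustrative assumptions. Because the squares are recomputed without the rounding of the printed columns, the result may differ from .928 in the third decimal.

```python
import math

M_y = 82.7       # mean of all the y's, from the text
sigma_2 = 55.6   # S.D. of all the y's, from the text

# (n, m_y) for each non-empty x-array, copied from the table
rows = [(2, 25.0), (7, 29.3), (3, 55.0), (5, 73.0), (9, 81.7), (4, 70.0),
        (2, 130.0), (2, 175.0), (1, 145.0), (2, 220.0), (1, 175.0)]

N = sum(n for n, _ in rows)
var_my = sum(n * (m_y - M_y) ** 2 for n, m_y in rows) / N  # sigma_my squared
sigma_my = math.sqrt(var_my)
eta_yx = sigma_my / sigma_2
print(N, round(sigma_my, 1), round(eta_yx, 3))
```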


76.     Spearman's  Formulae.     The  Method  of  Ranks*. 

In  the  method  of  ranks  the  actual  measurements  are  replaced 
by  the  numbers  representing  the  ranks  of  measurements  when 
they  are  arranged  in  order  of  magnitude.  Spearman  applied 
the  method  to  the  consideration  of  the  correlation  of  psychical 
performances in individuals. If $N$ individuals be considered, and
$v_1$, $v_2$ be the orders of merit of any one individual in the two
series of measurements, the product-moment reduces to

$$\rho = 1 - \frac{6\,\Sigma (v_1 - v_2)^2}{N(N^2 - 1)}.$$

Spearman also suggested a simpler formula, for the product-moment,

$$R = 1 - \frac{6\,\Sigma d}{N^2 - 1},$$

where $\Sigma d$ is the sum of gains in rank (or the sum of positive
differences) of the second series on the first.

* Amer. Jour. Psych., Vol. xv, 1904; Brit. Jour. Psych., Vol. ii, part 1.

The method of deduction of these formulae assumes that the
frequency distribution can be represented by a rectangle. Pearson*
has shown that if the frequency distribution is Gaussian, the
quantities $\rho$ and $R$ are connected with $r$ by the relations

$$r = 2\sin\left(\frac{\pi}{6}\,\rho\right),$$

$$r = 2\cos\left(\frac{\pi}{3}(1-R)\right) - 1.$$

When  the  frequency  distribution  is  not  Gaussian,  it  is  not 
clear how $r$ is to be deduced from $\rho$ or $R$. For this reason, the
above  formulae  are  not  to  be  recommended  for  use  in  accurate 
work.  A  further  disadvantage  is  that  the  method  does  not  yield 
the  values  of  the  standard  deviations.  The  formulae  may  be 
useful  however  in  rough  work  involving  only  a  small  number 
of  observations. 

In  order  to  compare  the  results  yielded  by  Spearman's 
formulae  with  the  result  deduced  from  the  product-moment 
formula,  we  shall  apply  these  formulae  to  the  example  of 
page  194.  The  stars  are  arranged  in  order  of  magnitude  of 
orbital  period  in  the  table  of  page  194.  The  positions  of  these 
stars  in  a  table  arranged  in  order  of  magnitude  of  eclipse- 
duration  are  given  in  the  second  column  below. 


  v₁    v₂     d     d²   |   v₁    v₂     d     d²

   1     1     0      0   |   20    13     7     49
   2     2     0      0   |   21    24    -3      9
   3     8    -5     25   |   22    27    -5     25
   4     4     0      0   |   23    25    -2      4
   5     4     1      1   |   24    30    -6     36
   6     4     2      4   |   25    23     2      4
   7     9    -2      4   |   26    17     9     81
   8     3     5     25   |   27    18     9     81
   9     4     5     25   |   28    16    12    144
  10    10     0      0   |   29    21     8     64
  11    10     1      1   |   30    25     5     25
  12    27   -15    225   |   31    31     0      0
  13    22    -9     81   |   32    33    -1      1
  14    18    -4     16   |   33    32     1      1
  15    13     2      4   |   34    35    -1      1
  16    13     3      9   |   35    38    -3      9
  17    12     5     25   |   36    36     0      0
  18    27    -9     81   |   37    37     0      0
  19    18     1      1   |   38    34     4     16

* Drapers' Company Research Memoirs, Biometric Series iv.
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.\R  =  1 
p  =  l 


38^-1 
6 . 1077 


38  (38^  -  1) 
Using  the  formulae  given  above,  we  find 


071  =  -929, 

■882.    ■ 


r  =  2  sin    -  X  -882 


TT 


5]  =  -891, 


TT 


r  =  2cos-(-071)-l  =  -994. 
s 

The  value  derived  for  r  by  the  use  of  the  product-moment 
formula  is  "881. 
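The evaluation of $\rho$ from the ranks, and its conversion to $r$ by Pearson's relation for a Gaussian distribution, can be sketched as follows. The ranks are copied from the table of orders of merit; the variable names are assumptions introduced for illustration, and only the first of Spearman's two formulae is treated.

```python
import math

# Ranks by orbital period (v1) and by eclipse duration (v2), copied from the table
v1 = list(range(1, 39))
v2 = [1, 2, 8, 4, 4, 4, 9, 3, 4, 10, 10, 27, 22, 18, 13, 13, 12, 27, 18,
      13, 24, 27, 25, 30, 23, 17, 18, 16, 21, 25, 31, 33, 32, 35, 38, 36, 37, 34]

N = len(v1)
d = [a - b for a, b in zip(v1, v2)]
sum_d2 = sum(k * k for k in d)

rho = 1 - 6 * sum_d2 / (N * (N * N - 1))     # Spearman's formula from squared differences
r_from_rho = 2 * math.sin(math.pi * rho / 6)  # Pearson's relation, Gaussian case only
print(sum_d2, round(rho, 3), round(r_from_rho, 3))
```

The conversion in the last step is justified only when the frequency distribution is Gaussian, as the text remarks.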

77.     The  Method  of  Contingency. 

The methods discussed above are only applicable in cases where
the characters considered are capable of quantitative measurement.
Such characters would be length, time, stellar magnitude, wages,
etc. When the grouping in the contingency table is purely
classificatory, the different classes bear no quantitative relation to
each other. Thus if stars be grouped according to a colour scale,
the scale cannot be regarded as an arithmetical scale. Except for
this difference in the nature of the scale, tables which represent
the distribution of two characters, of which one or both may be
purely qualitative, are of precisely the same form as the contingency
table given on page 151. The following is an example
of such a contingency table*, representing the distribution of
1360 stars according to spectral type and colour. The groupings
a, b, c, d, e, f are according to a scale of colour, where

stars of class a are white,
stars of class b are white with faint tinge of colour,
stars of class c are very pale yellow,
stars of class d are pale yellow,
stars of class e are full yellow,
stars of class f are ruddy.


* W. S. Franks, "On the Relation between Star Colours and Spectra," Monthly
Notices, R.A.S., Vol. lxvii, p. 541, Table IV.
Spectral Type         a          b          c          d          e          f       Total

Helium Stars         125        146          8          3          0          0       282
                   (61.38)   (103.05)    (44.37)    (37.74)    (33.18)     (2.28)

Hydrogen Stars       168        195         14          0          0          0       377
                   (82.05)   (137.77)    (59.32)    (50.45)    (44.35)     (3.05)

α Carinae Type         3         97         23          8          6          0       137
                   (29.82)    (50.07)    (21.56)    (18.33)    (16.12)     (1.11)

Solar Stars            0         41         77         33         29          0       180
                   (39.18)    (65.78)    (28.32)    (24.09)    (21.18)     (1.46)

Arcturus Type          0         15         86         77         63          0       241
                   (52.45)    (88.07)    (37.92)    (32.25)    (28.35)     (1.95)

Aldebaran Type         0          0          4         22         43          6        75
                   (16.32)    (27.41)    (11.80)    (10.04)     (8.82)      (.61)

Betelgeuse             0          3          2         39         19          5        68
                   (14.80)    (24.85)    (10.70)     (9.10)     (8.00)      (.55)

Total                296        497        214        182        160         11      1360

Leaving out of consideration for the moment the figures in
brackets in each compartment, we have a contingency table
resembling that of page 151. Even a casual examination of this table
shows that there is a tendency for stars low down in the scale of
spectra to have colours which are nearer to red than those of stars
higher in the scale of spectra; i.e. there is some degree of correlation
between colour and spectral type of stars.

The last row in the table gives the distribution according to
colour, and the last column the distribution according to spectral
type, of all the stars considered. If spectral type and colour were
independent factors, the number of stars in any compartment of
the table could be evaluated from the totals in the last row and
the last column; e.g. if colour were independent of spectral type,
the number of stars of colour c and of solar type would be

$$\frac{214}{1360} \times 180, \text{ or } 28.32.$$

The numbers which should occur in the other compartments can
be evaluated in the same way. These numbers are entered in
the compartments of the table, in brackets. These bracketed
figures give the distribution to be expected if there were no
correlation between spectral type and colour. The differences
between the two sets of figures are due to correlation between the
two characters.

It is customary to calculate the amount of correlation in the
following way:

Form the difference between the bracketed and unbracketed
figures in each compartment; square this difference and divide
the result by the corresponding bracketed figure. Then add
together all the numbers obtained in this manner, and divide the
sum by the total number of stars. This result is called the total
mean square contingency, and is denoted by $\phi^2$.

For the example shown in the above table,

$$1360\,\phi^2 = \frac{(125 - 61.38)^2}{61.38} + \frac{(146 - 103.05)^2}{103.05} + \ldots,$$

there being 42 terms in all;

$$\phi^2 = 1.021.$$

The function $c$ defined by

$$c = \sqrt{\frac{\phi^2}{1 + \phi^2}}$$

is called the coefficient of mean square contingency. The quantity $c$
is of the same nature as the coefficient of correlation $r$. When
the frequency distribution is normal, it can be shown that $c$
calculated as above is equal to $r$. It is clear from its definition
that $c$ must lie between $-1$ and $+1$. When there is no correlation
between the two characters considered,

$$\phi = 0 \text{ and } c = 0.$$

Small values of $\phi$ and $c$ indicate a small amount of correlation.
For the table given above

$$c = .71,$$

a value which may be taken to indicate a close relationship between
spectral type and colour.

For a full treatment of the subject of contingency the reader is referred to
the original memoir of Prof. Karl Pearson, "On the Theory of Contingency
and its Relation to Association and Normal Correlation," Drapers' Company
Research Memoirs, Biometric Series i, 1904, Dulau and Co.
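The whole calculation of the mean square contingency can be sketched in Python from the observed frequencies alone, the expected figures being formed from the marginal totals as described above. The observed counts are those of the star table (with the entry for solar stars of colour c taken as 77, the value required by the row and column totals); all names in the code are assumptions introduced for illustration.

```python
import math

# Observed frequencies: rows are spectral types, columns are colours a..f
observed = [
    [125, 146, 8, 3, 0, 0],    # Helium stars
    [168, 195, 14, 0, 0, 0],   # Hydrogen stars
    [3, 97, 23, 8, 6, 0],      # alpha Carinae type
    [0, 41, 77, 33, 29, 0],    # Solar stars
    [0, 15, 86, 77, 63, 0],    # Arcturus type
    [0, 0, 4, 22, 43, 6],      # Aldebaran type
    [0, 3, 2, 39, 19, 5],      # Betelgeuse
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
N = sum(row_totals)

total = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / N  # the bracketed figure
        total += (obs - expected) ** 2 / expected

phi2 = total / N                       # total mean square contingency
c = math.sqrt(phi2 / (1 + phi2))       # coefficient of mean square contingency
print(round(phi2, 3), round(c, 2))
```

The value of $\phi^2$ so obtained should lie close to the printed figure, small differences being attributable to rounding in the hand computation; the resulting $c$ agrees with the printed value to two decimals.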
Example. Evaluate the S.D. of a linear function of a number of correlated
variables.

Let $$F = ax + by + cz + \ldots$$

be the function.

Then $$F^2 = a^2x^2 + b^2y^2 + c^2z^2 + 2ab\,xy + 2ac\,xz + 2bc\,yz + \ldots.$$

It follows that

$$\sigma_F^2 = a^2\sigma_x^2 + b^2\sigma_y^2 + c^2\sigma_z^2 + 2ab\,r_{xy}\sigma_x\sigma_y + 2ac\,r_{xz}\sigma_x\sigma_z + 2bc\,r_{yz}\sigma_y\sigma_z + \ldots.$$

A special case is $$\sigma_{x-y}^2 = \sigma_x^2 + \sigma_y^2 - 2r_{xy}\sigma_x\sigma_y.$$

This affords a method of evaluating $r_{xy}$ by evaluating $\sigma_x$, $\sigma_y$ and $\sigma_{x-y}$.
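The special case may be verified numerically. The sketch below, in Python, uses two short series of hypothetical values (invented for the illustration) and recovers $r_{xy}$ from the three S.D.'s, comparing it with the product-moment value; the helper names `mean`, `sd` and `corr` are assumptions.

```python
import math

# Hypothetical paired measurements, for illustration only
x = [2.0, 4.0, 4.5, 7.0, 9.0, 10.5]
y = [1.0, 3.5, 2.5, 6.0, 6.5, 9.0]


def mean(v):
    return sum(v) / len(v)


def sd(v):
    """S.D. about the mean, dividing by the number of values."""
    m = mean(v)
    return math.sqrt(sum((t - m) ** 2 for t in v) / len(v))


def corr(u, v):
    """Product-moment correlation coefficient."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)
    return cov / (sd(u) * sd(v))


diff = [a - b for a, b in zip(x, y)]
sx, sy, sd_diff = sd(x), sd(y), sd(diff)

# Rearranged special case: r = (sx^2 + sy^2 - s_{x-y}^2) / (2 sx sy)
r_from_sds = (sx ** 2 + sy ** 2 - sd_diff ** 2) / (2 * sx * sy)
print(round(r_from_sds, 6), round(corr(x, y), 6))
```

The two printed numbers coincide, since the relation is an algebraic identity.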

78.     The  Meaning  of  the  Correlation  Coefficient. 

The meaning which we shall assign to the correlation coefficient
will to some extent depend upon the point of view from which we
approach the question. Referring back to § 70 we see that if the
coefficient of correlation $r$ between two variables is known, the
value of $y$ corresponding to any given value of $x$ can be given with
a M.S.E. $s_2 = \sigma_2\sqrt{1 - r^2}$, $\sigma_2$ being the M.S.E. of all the $y$'s. Our
knowledge of the correlation $r$ thus enables us to reduce the M.S.E. of
our estimate of the value of $y$ from $\sigma_2$ to $\sigma_2\sqrt{1 - r^2}$. From this
point of view it would appear preferable to measure correlation by
$\sqrt{1 - r^2}$ rather than by $r$.

It is customary, however, to regard $r$ as being in some way
a measure of the number of common causes which underlie the
variations of the two quantities considered. The following simple
case considered by Kapteyn may help to make this point clear.
Let the variations of $x$ and $y$ be due to a number $m + n$ of
elementary causes, $m$ of these causes being common to both $x$
and $y$, while the remaining $n$ causes are independent. Then we
may write

$$x = A_1\alpha_1 + A_2\alpha_2 + \ldots + A_m\alpha_m + B_1\beta_1 + B_2\beta_2 + \ldots + B_n\beta_n,$$
$$y = A_1\alpha_1 + A_2\alpha_2 + \ldots + A_m\alpha_m + C_1\gamma_1 + C_2\gamma_2 + \ldots + C_n\gamma_n,$$

where $\alpha_1, \alpha_2, \ldots, \beta_1, \beta_2, \ldots, \gamma_1, \gamma_2, \ldots,$ are all independent of one
another. We shall further simplify the problem by supposing that
the M.S.E. introduced into $x$ or $y$ by any one of the independent
elementary causes is unity. Then

$$x = \sum_{1}^{m}\alpha_s + \sum_{1}^{n}\beta_s,$$

$$y = \sum_{1}^{m}\alpha_s + \sum_{1}^{n}\gamma_s,$$

where the M.S.E. of each variable $\alpha$, $\beta$, or $\gamma$, is unity. Then if $\epsilon_x$, $\epsilon_y$
be the M.S.E. of $x$ and $y$ respectively,

$$\epsilon_x^2 = \epsilon_y^2 = m + n,$$

$$(x + y)^2 = 4\sum_{1}^{m}\alpha_s^2 + \sum_{1}^{n}\beta_s^2 + \sum_{1}^{n}\gamma_s^2 + 4\sum_{s \neq t}\alpha_s\alpha_t + 2\sum_{s<t}\beta_s\beta_t + 2\sum_{s<t}\gamma_s\gamma_t + 4\sum_{s,t}\alpha_s\beta_t + 4\sum_{s,t}\alpha_s\gamma_t + 2\sum_{s,t}\beta_s\gamma_t.$$

If $\epsilon_{x+y}$ be the M.S.E. of $x + y$, then the mean value of the R.H.S.
of the above equation is equal to $\epsilon_{x+y}^2$;

$$\therefore\ \epsilon_{x+y}^2 = 4\sum_{1}^{m}\alpha_s^2 + \sum_{1}^{n}\beta_s^2 + \sum_{1}^{n}\gamma_s^2 = 4m + n + n,$$

the product terms all vanishing.

The last equation may be written

$$4m + 2n = \epsilon_{x+y}^2 = \epsilon_x^2 + \epsilon_y^2 + 2r\,\epsilon_x\epsilon_y{}^* = 2m + 2n + 2r\,(m + n);$$

$$\therefore\ r = \frac{m}{m + n}.$$

Thus in this particular case $r$ measures the proportion of
elementary causes of variation which the two variables have in
common.

* Vide Example, page 169.
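Kapteyn's simple case lends itself to a numerical experiment. The following Python sketch draws each elementary cause from a normal law of unit M.S.E. and estimates $r$ from a large number of trials; the function name `simulate_r`, the number of trials and the seed are all assumptions made for the illustration, and the estimate is subject to sampling error.

```python
import math
import random


def simulate_r(m, n, trials=20000, seed=1):
    """Estimate r when x and y share m unit causes and each has n private ones."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(trials):
        common = sum(rng.gauss(0.0, 1.0) for _ in range(m))
        xs.append(common + sum(rng.gauss(0.0, 1.0) for _ in range(n)))
        ys.append(common + sum(rng.gauss(0.0, 1.0) for _ in range(n)))
    mx, my = sum(xs) / trials, sum(ys) / trials
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


r_est = simulate_r(3, 2)  # theory gives m / (m + n) = 0.6
print(round(r_est, 2))
```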


CHAPTER  XI 

HARMONIC    ANALYSIS    FROM    THE    STANDPOINT    OF 
LEAST    SQUARES 

79.  To  find  the  Amplitude  of  a  Single  Periodic  Term 
of  Known  Period. 

For the sake of definiteness we shall consider the problem
of finding the amplitude of the annual period in temperature
records, from the mean monthly temperature. Let

$$l_0,\ l_1,\ l_2,\ \ldots,\ l_{11}$$

be the mean temperatures for each of twelve months, starting with
January. Let the time be measured from the middle of January,
and let the length of the month (assumed constant) be $\alpha$. Then
the temperatures

$$l_0,\ l_1,\ l_2,\ \ldots,\ l_{11}$$

are associated with the times

$$0,\ \alpha,\ 2\alpha,\ \ldots,\ 11\alpha.$$

If $a$ be the required amplitude, we can write

$$l = A_0 + a\cos(\theta - \phi),$$

where $l$ takes the value $l_r$ when $\theta = \theta_r = 2\pi r/12$.

The variable $\theta$ passes through the range 0 to $2\pi$, while the
time passes through the range 0 to $12\alpha$. Expanding the trigonometric
term, we find

$$l = A_0 + A_1\cos\theta + B_1\sin\theta \qquad (1),$$

where $A_1 = a\cos\phi$, $B_1 = a\sin\phi$, $a = (A_1^2 + B_1^2)^{1/2}$.
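Giving $\theta$ the values $\theta_0, \theta_1, \ldots, \theta_{11}$ corresponding to the times
$0, \alpha, 2\alpha, \ldots, 11\alpha$, in turn, we obtain
12 observational equations to be solved by the method of least
squares. The normal equations, formed by the usual method, are

$$\sum_{r=0}^{11}\{l_r - A_0 - A_1\cos\theta_r - B_1\sin\theta_r\} = 0,$$

$$\sum_{r=0}^{11}\{l_r - A_0 - A_1\cos\theta_r - B_1\sin\theta_r\}\cos\theta_r = 0,$$

$$\sum_{r=0}^{11}\{l_r - A_0 - A_1\cos\theta_r - B_1\sin\theta_r\}\sin\theta_r = 0.$$

Now it can be easily shown that*

$$\Sigma\cos\theta_r = \Sigma\sin\theta_r = \Sigma\cos\theta_r\sin\theta_r = 0,$$
$$\Sigma\cos^2\theta_r = \tfrac{1}{2}\Sigma(1 + \cos 2\theta_r) = 6,$$
$$\Sigma\sin^2\theta_r = \tfrac{1}{2}\Sigma(1 - \cos 2\theta_r) = 6.$$

* $\{\cos\alpha + \cos(\alpha + \beta) + \cos(\alpha + 2\beta) + \ldots + \cos(\alpha + \overline{n-1}\,\beta)\} \times 2\sin\tfrac{1}{2}\beta$

$= \{\sin(\alpha + \tfrac{1}{2}\beta) - \sin(\alpha - \tfrac{1}{2}\beta)\} + \{\sin(\alpha + \tfrac{3}{2}\beta) - \sin(\alpha + \tfrac{1}{2}\beta)\}$

$\quad + \{\sin(\alpha + \tfrac{5}{2}\beta) - \sin(\alpha + \tfrac{3}{2}\beta)\} + \ldots + \{\sin(\alpha + \tfrac{2n-1}{2}\beta) - \sin(\alpha + \tfrac{2n-3}{2}\beta)\}$

$= \sin(\alpha + \tfrac{2n-1}{2}\beta) - \sin(\alpha - \tfrac{1}{2}\beta)$

$= 2\cos(\alpha + \tfrac{n-1}{2}\beta)\,\sin\tfrac{n\beta}{2}.$

$$\therefore\ \cos\alpha + \cos(\alpha + \beta) + \cos(\alpha + 2\beta) + \ldots + \cos(\alpha + \overline{n-1}\,\beta) = \cos\left(\alpha + \frac{n-1}{2}\beta\right)\frac{\sin\frac{n\beta}{2}}{\sin\frac{\beta}{2}};$$

similarly

$$\sin\alpha + \sin(\alpha + \beta) + \sin(\alpha + 2\beta) + \ldots + \sin(\alpha + \overline{n-1}\,\beta) = \sin\left(\alpha + \frac{n-1}{2}\beta\right)\frac{\sin\frac{n\beta}{2}}{\sin\frac{\beta}{2}}.$$

In the series $\Sigma\cos\theta_r$, $\Sigma\sin\theta_r$, $n = 12$, $\beta = \dfrac{\pi}{6}$, $\dfrac{n\beta}{2} = \pi$, and the factor
$\sin\dfrac{n\beta}{2}$ vanishes.

In the series $\Sigma\sin 2\theta_r$, $\Sigma\cos 2\theta_r$, $\beta = \dfrac{\pi}{3}$, $\dfrac{n\beta}{2} = 2\pi$, and the factor $\sin\dfrac{n\beta}{2}$
again vanishes.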
Substituting these values, we find*

$$A_0 = \frac{1}{12}\sum_{r=0}^{11} l_r,$$

$$A_1 = \frac{1}{6}\sum_{r=0}^{11} l_r\cos\theta_r,$$

$$B_1 = \frac{1}{6}\sum_{r=0}^{11} l_r\sin\theta_r \qquad (2).$$

Equations (2) are the normal equations for determining the
values of the constants $A_0$, $A_1$, $B_1$. The evaluation can easily be
carried out as follows: Write in a column the values of $l$, in a
second column the values of $\cos\theta$, in a third the values of $\sin\theta$.
Multiply the second and third columns by the first, writing the
results in the fourth and fifth columns. From the sums of the
first, fourth, and fifth columns, we can easily deduce the values
of the constants. The headings of the different columns in the
table will be

$l$,    $\cos\theta$,    $\sin\theta$,    $l\cos\theta$,    $l\sin\theta$.

This method, although it leads to the values of the constants,
is not the shortest. On page 183 will be found a form of calculation
which will shorten the work when there are 12 readings.

Equations (2) above can be immediately generalised for the
case of $n$ equidistant readings,

$$nA_0 = \Sigma\, l,$$

$$\frac{n}{2}A_1 = \Sigma\, l\cos\theta,$$

$$\frac{n}{2}B_1 = \Sigma\, l\sin\theta \qquad (3).$$
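The column scheme just described is easily put into code. The following Python sketch assumes $n$ equidistant readings with $\theta_r = 2\pi r/n$ and forms the constants as in equations (3); the function name `annual_term` and the test series (built from known constants so that the answer can be checked) are assumptions made for the illustration.

```python
import math


def annual_term(l):
    """Least-squares A0, A1, B1, amplitude a and phase phi from n equidistant readings."""
    n = len(l)
    thetas = [2 * math.pi * r / n for r in range(n)]
    A0 = sum(l) / n
    A1 = 2 / n * sum(v * math.cos(t) for v, t in zip(l, thetas))
    B1 = 2 / n * sum(v * math.sin(t) for v, t in zip(l, thetas))
    a = math.hypot(A1, B1)        # a = (A1^2 + B1^2)^(1/2)
    phi = math.atan2(B1, A1)      # A1 = a cos(phi), B1 = a sin(phi)
    return A0, A1, B1, a, phi


# Hypothetical monthly means generated from known constants
true_A0, true_a, true_phi = 10.0, 3.0, 0.5
l = [true_A0 + true_a * math.cos(2 * math.pi * r / 12 - true_phi) for r in range(12)]

A0, A1, B1, a, phi = annual_term(l)
print(round(A0, 6), round(a, 6), round(phi, 6))
```

With real temperature records the recovered constants would of course not reproduce the readings exactly, and the residuals would measure the departure from a single periodic term.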


80.     General  case  involving  more  than  one  Period. 

Let $l_0, l_1, l_2, \ldots, l_{n-1}$ be $n$ observed values, which are
associated with equidistant values of some argument (say time).
To fix ideas let us suppose the measurements $l_0$, $l_1$, etc., to be
made at times $t_0,\ t_0 + \alpha,\ t_0 + 2\alpha,\ \ldots,\ t_0 + \overline{n-1}\,\alpha$. Then it will be
possible to represent the values $l_0$, $l_1$, etc., by means of a trigonometric
series, thus,

$$l = A_0 + A_1\cos\theta + A_2\cos 2\theta + \ldots + B_1\sin\theta + B_2\sin 2\theta + \ldots \qquad (4),$$

where $l$ takes the value $l_r$ when $\theta = r\,\dfrac{2\pi}{n}$, and $r$ can take all positive
integral values from 0 to $n - 1$.

* Compare Example 3, page 86.

If in equation (4) we include $n$ constants, then by giving
$r$ the successive values $0, 1, 2, \ldots, n-1$, we obtain $n$ equations,
whose solution yields the values of the $n$ constants. We shall
then have obtained an accurate representation of all $n$ observations
by means of the series of trigonometric terms (4). In
practice, however, this accurate solution is seldom required, and
the problem is rather that of representing a series of $n$ observations,
with fair accuracy, by means of a trigonometric series
containing fewer than $n$ constants.

Such series as are represented in (4) are called Fourier series.
It should be noted that the number of sine terms is generally
taken to be either equal to, or one less than, the number of cosine
terms. The reason for this will appear in the sequel.


81.  We shall now consider the problem as it arises in
practice, i.e. how we shall best represent a series of $n$ observations
by means of a series

$$l = A_0 + A_1\cos\theta + A_2\cos 2\theta + \ldots + A_m\cos m\theta + B_1\sin\theta + B_2\sin 2\theta + \ldots + B_m\sin m\theta$$

or

$$l = A_0 + \Sigma A_i\cos i\theta + \Sigma B_i\sin i\theta \qquad (5),$$

where $n > 2m + 1$.

We have $n$ observational equations, which we shall solve by
the methods of least squares. The normal equations are

$$\sum_{r=0}^{n-1} l_r - nA_0 - \sum_{r=0}^{n-1}\sum_{i=1}^{m} A_i\cos\frac{2\pi ri}{n} - \sum_{r=0}^{n-1}\sum_{i=1}^{m} B_i\sin\frac{2\pi ri}{n} = 0,$$

with $m$ equations of the form

$$\sum_{r=0}^{n-1} l_r\cos\frac{2\pi rs}{n} - \sum_{r=0}^{n-1} A_0\cos\frac{2\pi rs}{n} - \sum_{r=0}^{n-1}\sum_{i=1}^{m} A_i\cos\frac{2\pi ri}{n}\cos\frac{2\pi rs}{n} - \sum_{r=0}^{n-1}\sum_{i=1}^{m} B_i\sin\frac{2\pi ri}{n}\cos\frac{2\pi rs}{n} = 0,$$

where $s = 1, 2, \ldots, m$,

and $m$ equations of the form

$$\sum_{r=0}^{n-1} l_r\sin\frac{2\pi rs}{n} - \sum_{r=0}^{n-1} A_0\sin\frac{2\pi rs}{n} - \sum_{r=0}^{n-1}\sum_{i=1}^{m} A_i\cos\frac{2\pi ri}{n}\sin\frac{2\pi rs}{n} - \sum_{r=0}^{n-1}\sum_{i=1}^{m} B_i\sin\frac{2\pi ri}{n}\sin\frac{2\pi rs}{n} = 0 \qquad (6),$$

where $s = 1, 2, \ldots, m$.

The series $\Sigma\cos\dfrac{2\pi ri}{n}$, $\Sigma\sin\dfrac{2\pi ri}{n}$ are of the form of the series
given in the footnote on page 172, where $\beta = \dfrac{2\pi i}{n}$, so that

$$\sin\frac{n\beta}{2} = \sin\pi i = 0.$$

Similarly, when $s$ is not equal to $i$,

$$\sum_{r=0}^{n-1}\cos\frac{2\pi ri}{n}\sin\frac{2\pi rs}{n} = \frac{1}{2}\sum_{r=0}^{n-1}\left\{\sin\frac{2\pi r(s+i)}{n} + \sin\frac{2\pi r(s-i)}{n}\right\} = 0,$$

since each of the series is zero*. In the same way it can be
shown that

$$\sum_{r=0}^{n-1}\cos\frac{2\pi ri}{n}\cos\frac{2\pi rs}{n} = \sum_{r=0}^{n-1}\sin\frac{2\pi ri}{n}\cos\frac{2\pi rs}{n} = \sum_{r=0}^{n-1}\sin\frac{2\pi ri}{n}\sin\frac{2\pi rs}{n} = 0$$

when $s$ is not equal to $i$.

* Compare with footnote, page 172.

When $s = i$

$$\sum_{r=0}^{n-1}\cos^2\frac{2\pi rs}{n} = \frac{1}{2}\sum_{r=0}^{n-1}\left(1 + \cos\frac{4\pi rs}{n}\right) = \frac{n}{2},$$

since $\displaystyle\sum_{r=0}^{n-1}\cos\frac{4\pi rs}{n} = 0$.

Similarly

$$\sum_{r=0}^{n-1}\sin^2\frac{2\pi rs}{n} = \frac{1}{2}\sum_{r=0}^{n-1}\left(1 - \cos\frac{4\pi rs}{n}\right) = \frac{n}{2}.$$

Substituting these values for the trigonometric series which
occur in the normal equations, we obtain the following simplified
form of the normal equations:

$$\sum_{r=0}^{n-1} l_r - nA_0 = 0,$$

$$\sum_{r=0}^{n-1} l_r\cos\frac{2\pi rs}{n} - \frac{n}{2}A_s = 0,$$

$$\sum_{r=0}^{n-1} l_r\sin\frac{2\pi rs}{n} - \frac{n}{2}B_s = 0 \qquad (7),$$

where $s = 1, 2, \ldots, m$,

$n > 2m + 1$.

Since the coefficients $A_s$, $B_s$ derived from equations (7) are
obviously independent of one another, the work may, if necessary,
be carried out by successive stages, until a sufficiently close
approximation to the series of observations is obtained. It will
be shown that the addition of a pair of extra terms to the
series improves the accuracy of the representation of the observations
by the series.

If we start with the assumption

$$l = A_0 + A_1\cos\theta + A_2\cos 2\theta + B_1\sin\theta + B_2\sin 2\theta \qquad (a),$$

the coefficients $A_0$, $A_1$, $A_2$, $B_1$, $B_2$ may be evaluated by means of
equations (7). If we should then require to carry the work
a stage further, we might use the equation

$$l = A_0 + A_1\cos\theta + A_2\cos 2\theta + A_3\cos 3\theta + B_1\sin\theta + B_2\sin 2\theta + B_3\sin 3\theta \qquad (b).$$

The least squares solution of equation (b) is the same as that
of equation (a) as far as the evaluation of $A_0$, $A_1$, $A_2$, $B_1$, $B_2$ is
concerned, and the only fresh work involved is the evaluation
of $A_3$ and $B_3$. These coefficients are obtained by the use of
equations (7).

The property of independence of the coefficients leads to an important result. If we adopt equation (b) above, $[vv]$ is a minimum when the coefficients in this equation are deduced from equations (7). Hence $[vv]$ calculated from equation (b) is less than would be obtained by putting $A_3 = 0$, $B_3 = 0$; i.e. it is less than $[vv]$ calculated from equation (a). It follows that each successive pair of terms added to the Fourier series means a gain in the accuracy with which the observations are represented by the series. As we go on adding terms to the Fourier series the residuals diminish continually, until, when the number of terms in the series becomes equal to the number of observations, the residuals all vanish, and the representation of the observations by the series is perfect.

82. Evaluation of $[vv]$.

The residuals are given by the equations

$$v_r + l_r = A_0 + \sum_{i=1}^{m} A_i\cos\frac{2\pi r i}{n} + \sum_{i=1}^{m} B_i\sin\frac{2\pi r i}{n},$$

where $r = 0, 1, 2, \dots, n-1$.

Multiplying each such equation by the corresponding $l_r$, and remembering that $[vv] = -[vl]$, we obtain the relation

$$[vv] = [ll] - A_0\sum_{r=0}^{n-1} l_r - \sum_{r=0}^{n-1}\sum_{i=1}^{m} l_r A_i\cos\frac{2\pi r i}{n} - \sum_{r=0}^{n-1}\sum_{i=1}^{m} l_r B_i\sin\frac{2\pi r i}{n} \qquad (8).$$

The coefficient of $-A_i$ in this equation is

$$\sum_{r=0}^{n-1} l_r\cos\frac{2\pi r i}{n},$$

which by equations (7) is $\frac{n}{2}A_i$; and the coefficient of $-B_i$ is

$$\sum_{r=0}^{n-1} l_r\sin\frac{2\pi r i}{n} \quad\text{or}\quad \frac{n}{2}B_i.$$

Thus equation (8) may be written

$$[vv] = [ll] - nA_0^2 - \frac{n}{2}\sum_{i=1}^{m}\left(A_i^2 + B_i^2\right) \qquad (9).$$

Equation (9) shows that the addition of each successive pair of terms to the Fourier series decreases $[vv]$, while, as we have already seen, this in no way affects the preceding terms.
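Equation (9) can be checked numerically. The following Python sketch is illustrative only: the nine observations are made up for the purpose, and the helper names are hypothetical. It compares the sum of squared residuals computed directly with the value given by equation (9), for series carried to $m = 0, 1, \dots, 4$ pairs of terms.

```python
import math

def coefficients(l, m):
    """A0, [A1..Am], [B1..Bm] from equations (7)."""
    n = len(l)
    a0 = sum(l) / n
    a = [2.0 / n * sum(x * math.cos(2 * math.pi * r * s / n) for r, x in enumerate(l))
         for s in range(1, m + 1)]
    b = [2.0 / n * sum(x * math.sin(2 * math.pi * r * s / n) for r, x in enumerate(l))
         for s in range(1, m + 1)]
    return a0, a, b

def vv_direct(l, a0, a, b):
    """[vv] formed by squaring and adding the individual residuals."""
    n = len(l)
    total = 0.0
    for r, x in enumerate(l):
        theta = 2 * math.pi * r / n
        fit = a0 + sum(a[i] * math.cos((i + 1) * theta) + b[i] * math.sin((i + 1) * theta)
                       for i in range(len(a)))
        total += (fit - x) ** 2
    return total

def vv_eq9(l, a0, a, b):
    """[vv] from equation (9): [ll] - n*A0^2 - (n/2)*sum(Ai^2 + Bi^2)."""
    n = len(l)
    return (sum(x * x for x in l) - n * a0 ** 2
            - n / 2.0 * sum(ai ** 2 + bi ** 2 for ai, bi in zip(a, b)))

obs = [2.0, 3.5, 1.0, 4.2, 0.7, 2.9, 3.3, 1.8, 2.4]   # hypothetical observations, n = 9
vv_by_m = [vv_eq9(obs, *coefficients(obs, m)) for m in range(5)]
vv_check = [vv_direct(obs, *coefficients(obs, m)) for m in range(5)]
```

With nine observations the series with $m = 4$ contains nine coefficients, so the last entry of `vv_by_m` should vanish apart from rounding.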

83. Case where $n$ terms are taken in the Fourier Series.

If $n$ be an odd number, the greatest value of $m$ is given by

$$n = 2m + 1 \quad\text{or}\quad m = \frac{n}{2} - \frac{1}{2} = \frac{n-1}{2}.$$

The Fourier series can thus be carried as far as the terms involving $\frac{n-1}{2}\theta$. The series then gives an exact representation of the $n$ observations, and all the coefficients can be deduced by the use of equations (7) above.

When $n$ is an even number, it is not possible to carry the sine terms as far as $B_{n/2}$, since equations (7) yield zero for the value of $B_s$ when $s = \frac{n}{2}$. The sine terms must therefore stop at $B_{\frac{n}{2}-1}$. Equations (6) yield a determinate value for $A_{n/2}$ and the cosine series can therefore be carried as far as $A_{n/2}\cos\frac{n}{2}\theta$. Returning to equations (6) and putting $s = \frac{n}{2}$, we find that the coefficient of $A_{n/2}$ is $\sum \cos^2 \pi r$ or $n$, and so $A_{n/2}$ is given by the equation

$$nA_{n/2} = \sum (l_r\cos \pi r) = \sum (-1)^r l_r \qquad (10).$$

Note the factor $n$ which replaces the $\frac{n}{2}$ of equations (7).

A slight modification has to be made in equation (9). For, in equation (8) the coefficient of $A_{n/2}$ is $\sum (l_r\cos \pi r)$ or $nA_{n/2}$. Equation (9) then reads

$$[vv] = [ll] - nA_0^2 - \frac{n}{2}\sum_{i=1}^{\frac{n}{2}-1}\left(A_i^2 + B_i^2\right) - nA_{n/2}^2 \qquad (11).$$

It should be noted that equations (10) and (11) are only required when it is desired to obtain an accurate representation of an even number of observations by a Fourier series containing as many terms as there are observations. In all other cases equations (7) and (9) must be used.
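The case of an even number of observations can be illustrated by the following Python sketch, which is not part of the original text. The six observations and the function names are hypothetical. The sketch forms the coefficients from equations (7) and (10), rebuilds the observations from the series, and evaluates $[vv]$ by equation (11).

```python
import math

def full_series_even(l):
    """Coefficients of the complete series for an even number n of observations.

    Returns (A, B) with A[0..n/2] and B[0..n/2 - 1]; B[0] is an unused placeholder.
    """
    n = len(l)
    if n == 0 or n % 2:
        raise ValueError("n must be a positive even number")
    half = n // 2
    A = [sum(l) / n]
    B = [0.0]
    for s in range(1, half):
        A.append(2.0 / n * sum(x * math.cos(2 * math.pi * r * s / n) for r, x in enumerate(l)))
        B.append(2.0 / n * sum(x * math.sin(2 * math.pi * r * s / n) for r, x in enumerate(l)))
    # Equation (10): the divisor is n, not n/2, for the last cosine term.
    A.append(sum((-1) ** r * x for r, x in enumerate(l)) / n)
    return A, B

def evaluate(A, B, theta):
    """Value of the complete series at angle theta (radians)."""
    return (sum(A[s] * math.cos(s * theta) for s in range(len(A)))
            + sum(B[s] * math.sin(s * theta) for s in range(1, len(B))))

obs = [4.0, 1.5, -2.0, 0.5, 3.0, 2.5]          # hypothetical observations, n = 6
n = len(obs)
A, B = full_series_even(obs)
fitted = [evaluate(A, B, 2 * math.pi * r / n) for r in range(n)]

# Equation (11) for the sum of squared residuals.
vv = (sum(x * x for x in obs) - n * A[0] ** 2
      - n / 2.0 * sum(A[i] ** 2 + B[i] ** 2 for i in range(1, n // 2))
      - n * A[n // 2] ** 2)
```

Both `fitted` agreeing with `obs` and `vv` vanishing (to rounding) express the same fact: the series with $n$ terms reproduces the $n$ observations exactly.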


84. Fourier's Theorem.

When the number of observations, $n$, becomes very large, and we write

$$\theta = 2\pi\frac{r}{n}, \qquad d\theta = \frac{2\pi}{n},$$

equations (7) may be written

$$A_s = \frac{d\theta}{\pi}\sum (l\cos s\theta),$$

$$B_s = \frac{d\theta}{\pi}\sum (l\sin s\theta).$$

In the limiting case, where $n$ is infinite, $d\theta$ becomes an infinitesimally small quantity, and the summations may be written as integrals

$$\left.\begin{aligned}
A_0 &= \frac{1}{2\pi}\int_0^{2\pi} l\,d\theta\\
A_s &= \frac{1}{\pi}\int_0^{2\pi} l\cos s\theta\,d\theta\\
B_s &= \frac{1}{\pi}\int_0^{2\pi} l\sin s\theta\,d\theta
\end{aligned}\right\} \qquad (12).$$

If the values of $l$ be given in terms of the time $t$ as argument, and if $t_0$ and $t_0 + T$ be the initial and final times, then the $\theta$ of equations (12) is given by

$$\theta = 2\pi\,\frac{t - t_0}{T}.$$

Equations (12) correspond to the case where the quantity $l$ is known for an infinite number of values of the argument $\theta$, or $t$; and if $l$ is finite for all values of $t$ between $t_0$ and $t_0 + T$, the integrals of equations (12) will be finite and single-valued. This conclusion will hold even when $l$, expressed as a function of $t$, has a number of finite discontinuities within the range considered. This is equivalent to a statement of Fourier's Theorem.

Fourier's Theorem states that if $l$ is a function of $\theta$ which is finite and has only a limited number of maxima and minima and of finite discontinuities within the range $0 < \theta < 2\pi$, it can be represented by a trigonometric series

$$A_0 + \sum A_s\cos s\theta + \sum B_s\sin s\theta$$

(where the constants $A_0, A_1, A_2$, etc., are given by equations (12) above) at all points where it is continuous. At points where $l$ is discontinuous the series gives the arithmetic mean of the two values of $l$ at the discontinuity.

This Theorem, so far as it concerns the representation of $l$ by the series at points where $l$ is continuous, follows from equations (12) above. For a full treatment of Fourier's Theorem the reader is referred to any modern textbook of mathematical analysis.

85. Evaluation of the Coefficients. $n$ a multiple of 4.

The work of evaluating the coefficients is much simplified when $n$ is a multiple of 4. Let $n = 4p$, where $p$ is an integer. Then as $r$ takes in succession all values from 0 to $4p - 1$, $\cos\frac{2\pi r s}{n}$ and $\sin\frac{2\pi r s}{n}$ each take the same absolute value four times.

If we write the observations in order, thus,

$$\begin{array}{llllll}
l_0 & l_1 & l_2 & \dots & l_{2p-1} & l_{2p}\\
 & l_{4p-1} & l_{4p-2} & \dots & l_{2p+1} &
\end{array}$$

then the sines which multiply corresponding $l$'s have opposite signs, while the cosines have the same sign, for all values of $s$. Adding the two rows, we obtain

$$a_0 \quad a_1 \quad a_2 \quad \dots \quad a_{2p},$$

and subtracting them, we obtain

$$b_1 \quad b_2 \quad \dots \quad b_{2p-1}.$$

In dealing with sines, the end terms drop out, since they are multiplied by a factor which becomes zero for all values of $s$. The normal equations for the coefficients may now be written

$$2pA_i = \sum (a_r\cos riq), \qquad\text{where } q = \frac{2\pi}{n} = \frac{\pi}{2p},$$

$$2pB_i = \sum (b_r\sin riq)$$

for all values of $i$ from 1 to $\frac{n}{2} - 1$, and

$$4pA_{n/2} = \sum (a_r\cos \pi r) = \sum (-1)^r a_r.$$

But

$$\cos riq = \pm\cos(i\pi - riq),$$

$$\sin riq = \mp\sin(i\pi - riq),$$

where the upper or lower sign is taken according as $i$ is even or odd.

We now write the $a$'s in the form

$$\begin{array}{llllll}
a_0 & a_1 & a_2 & \dots & a_{p-1} & a_p\\
a_{2p} & a_{2p-1} & a_{2p-2} & \dots & a_{p+1} &
\end{array}$$

Adding the columns, we obtain

$$\alpha_0 \quad \alpha_1 \quad \alpha_2 \quad \dots \quad \alpha_{p-1} \quad \alpha_p,$$

and subtracting,

$$\alpha_0' \quad \alpha_1' \quad \alpha_2' \quad \dots \quad \alpha_{p-1}'.$$

Then it follows that for even values of $i$,

$$2pA_i = \sum (\alpha_r\cos riq), \qquad (r = 0, 1, 2, \dots, p),$$

and for odd values of $i$,

$$2pA_i = \sum (\alpha_r'\cos riq),$$

while the first and last coefficients are given by

$$4pA_0 = \alpha_0 + \alpha_1 + \alpha_2 + \dots, \qquad 4pA_{n/2} = \alpha_0 - \alpha_1 + \alpha_2 - \dots.$$

Similarly, writing

$$\begin{array}{lllll}
b_1 & b_2 & \dots & b_{p-1} & b_p\\
b_{2p-1} & b_{2p-2} & \dots & b_{p+1} &
\end{array}$$

and adding, we obtain

$$\beta_1 \quad \beta_2 \quad \dots \quad \beta_{p-1} \quad \beta_p,$$

and subtracting, we obtain

$$\beta_1' \quad \beta_2' \quad \dots \quad \beta_{p-1}'.$$

Then for even values of $i$,

$$2pB_i = \sum (\beta_r'\sin riq), \qquad (r = 1, 2, \dots, p-1),$$

and for odd values of $i$,

$$2pB_i = \sum (\beta_r\sin riq), \qquad (r = 1, 2, 3, \dots, p).$$

The equations derived above are collected for convenience of reference.

$$\begin{aligned}
2pA_1 &= \alpha_0' + \alpha_1'\cos q + \alpha_2'\cos 2q + \dots,\\
2pA_2 &= \alpha_0 + \alpha_1\cos 2q + \alpha_2\cos 4q + \dots,\\
2pA_3 &= \alpha_0' + \alpha_1'\cos 3q + \alpha_2'\cos 6q + \dots,\\
2pA_4 &= \alpha_0 + \alpha_1\cos 4q + \alpha_2\cos 8q + \dots,\\[4pt]
2pB_1 &= \beta_1\sin q + \beta_2\sin 2q + \dots,\\
2pB_2 &= \beta_1'\sin 2q + \beta_2'\sin 4q + \dots,\\
2pB_3 &= \beta_1\sin 3q + \beta_2\sin 6q + \dots.
\end{aligned}$$

These formulae may be used in all cases where $n$ is a multiple of 4. In the special case where $n = 12$, $p = 3$ and $q = 30°$, and the expressions derived above contain at most four terms.
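The double folding described in this section can be sketched in Python as follows. The sketch is illustrative and not part of the original text: the names `folded_coefficients` and `direct_coefficients` and the twelve sample values are assumptions made for the example. The folded result is compared with the coefficients obtained directly from equations (7) and (10).

```python
import math

def folded_coefficients(l):
    """A_0..A_{2p} and B_1..B_{2p-1} for n = 4p observations, by two foldings."""
    n = len(l)
    if n == 0 or n % 4:
        raise ValueError("n must be a positive multiple of 4")
    p = n // 4
    q = math.pi / (2 * p)                                   # q = 2*pi/n
    # First folding: sums a_0..a_{2p} and differences b_1..b_{2p-1}.
    a = [l[0]] + [l[r] + l[n - r] for r in range(1, 2 * p)] + [l[2 * p]]
    b = [0.0] + [l[r] - l[n - r] for r in range(1, 2 * p)]  # b[0] is a placeholder
    # Second folding about r = p.
    alpha = [a[r] + a[2 * p - r] for r in range(p)] + [a[p]]
    alpha_d = [a[r] - a[2 * p - r] for r in range(p)]
    beta = [0.0] + [b[r] + b[2 * p - r] for r in range(1, p)] + [b[p]]
    beta_d = [0.0] + [b[r] - b[2 * p - r] for r in range(1, p)]
    A = {0: sum(alpha) / (4 * p)}
    B = {}
    for i in range(1, 2 * p):
        if i % 2 == 0:   # even harmonics use alpha and beta'
            A[i] = sum(alpha[r] * math.cos(r * i * q) for r in range(p + 1)) / (2 * p)
            B[i] = sum(beta_d[r] * math.sin(r * i * q) for r in range(1, p)) / (2 * p)
        else:            # odd harmonics use alpha' and beta
            A[i] = sum(alpha_d[r] * math.cos(r * i * q) for r in range(p)) / (2 * p)
            B[i] = sum(beta[r] * math.sin(r * i * q) for r in range(1, p + 1)) / (2 * p)
    A[2 * p] = sum((-1) ** r * alpha[r] for r in range(p + 1)) / (4 * p)
    return A, B

def direct_coefficients(l):
    """The same coefficients from equations (7) and (10), without folding."""
    n = len(l)
    A = {0: sum(l) / n, n // 2: sum((-1) ** r * x for r, x in enumerate(l)) / n}
    B = {}
    for i in range(1, n // 2):
        A[i] = 2.0 / n * sum(x * math.cos(2 * math.pi * r * i / n) for r, x in enumerate(l))
        B[i] = 2.0 / n * sum(x * math.sin(2 * math.pi * r * i / n) for r, x in enumerate(l))
    return A, B

sample = [3.1, 4.7, 2.2, 5.9, 1.4, 0.8, 2.6, 3.3, 4.0, 1.9, 2.7, 3.8]  # made-up values
A_fold, B_fold = folded_coefficients(sample)
A_dir, B_dir = direct_coefficients(sample)
```

The folding reduces each sum from $n$ products to about $n/4$, which was the point of the arrangement when the work was done by hand.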


Example 1. Find a trigonometric series which shall represent accurately the following series of determinations of the monthly mean temperature on Ben Nevis during 1902.

    Jan.     23.8° F.        July    37.6° F.
    Feb.     22.2            Aug.    38.0
    March    25.8            Sept.   38.4
    April    27.1            Oct.    32.4
    May      27.6            Nov.    30.0
    June     39.5            Dec.    25.9

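The observations are first folded about January, giving the sums $a$ and the differences $b$:

    r      l_r     l_(12-r)      a         b
    0      23.8                 23.8
    1      22.2     25.9        48.1      -3.7
    2      25.8     30.0        55.8      -4.2
    3      27.1     32.4        59.5      -5.3
    4      27.6     38.4        66.0     -10.8
    5      39.5     38.0        77.5       1.5
    6      37.6                 37.6

The $a$'s and $b$'s are then folded in turn:

    r      a_r     a_(6-r)     alpha     alpha'       b_r     b_(6-r)    beta     beta'
    0      23.8     37.6        61.4     -13.8
    1      48.1     77.5       125.6     -29.4       -3.7       1.5      -2.2     -5.2
    2      55.8     66.0       121.8     -10.2       -4.2     -10.8     -15.0      6.6
    3      59.5                 59.5                 -5.3                -5.3

$$\alpha_0 + \alpha_2 = 183.2, \qquad \alpha_1 + \alpha_3 = 185.1, \qquad \alpha_0' - \alpha_2' = -3.6, \qquad \beta_1 - \beta_3 = 3.1.$$

We have $q = 30°$, $\cos 30° = 0.866$, $\cos 60° = 0.5$, $\cos 90° = 0$,

$$12A_0 = 183.2 + 185.1 = 368.3.$$

For the odd $A$'s,

    r      cos θ       α' cos θ
    0      1           -13.8
    1      ±0.866      ∓25.46
    2      0.5          -5.1

(Upper sign) $6A_1 = -44.36$. (Lower sign) $6A_5 = 6.56$.

$$6A_3 = \alpha_0' - \alpha_2' = -3.6.$$

For the even $A$'s,

    r      cos θ       α cos θ
    0      1            61.4
    1      ±0.5        ±62.8
    2      -0.5        -60.9
    3      ∓1          ∓59.5

(Upper sign) $6A_2 = 3.8$. (Lower sign) $6A_4 = -2.8$.

$$12A_6 = \alpha_0 + \alpha_2 - \alpha_1 - \alpha_3 = 183.2 - 185.1 = -1.9.$$

For the $B$'s,

    r      sin θ       β sin θ          r      sin θ       β' sin θ
    1      0.5          -1.1            1      0.866       -4.50
    2      ±0.866      ∓12.99           2      ±0.866      ±5.71
    3      1            -5.3

(Upper sign) $6B_1 = -19.39$. (Upper sign) $6B_2 = 1.21$.

(Lower sign) $6B_5 = 6.59$. (Lower sign) $6B_4 = -10.21$.

$$6B_3 = \beta_1 - \beta_3 = 3.1.$$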
The required trigonometrical expression is therefore

$$\begin{aligned}
\text{Temperature} = {} & 30°.69 - 7°.39\cos\theta + 0°.63\cos 2\theta - 0°.60\cos 3\theta\\
& - 0°.47\cos 4\theta + 1°.09\cos 5\theta - 0°.16\cos 6\theta\\
& - 3°.23\sin\theta + 0°.20\sin 2\theta + 0°.52\sin 3\theta - 1°.70\sin 4\theta\\
& + 1°.10\sin 5\theta.
\end{aligned}$$

A simple check on a portion of the work is obtained by putting $\theta = 0$ in this expression. Its value is then

$$30.69 - 7.39 + 0.63 - 0.60 - 0.47 + 1.09 - 0.16 = 23.79,$$

which compares favourably with the measured temperature for January, 23°.8. Again, since the formula should give an exact representation of the 12 temperatures, $[vv]$ as derived from equation (11) should be zero.

$$\begin{aligned}
[vv] = {} & 11734.2 - 12\times(30.69)^2 - 12\times(0.16)^2\\
& - 6\,\{(7.39)^2 + (0.63)^2 + (0.60)^2 + (0.47)^2 + (1.09)^2\\
& \qquad + (3.23)^2 + (0.20)^2 + (0.52)^2 + (1.70)^2 + (1.10)^2\} = 1.65.
\end{aligned}$$

This is sufficiently near to zero in view of the order of approximation to which the work has been carried out.
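The arithmetic of this example can be checked with a short Python sketch, given here as an illustration and not as part of the original text. It applies equations (7) and (10) directly to the twelve Ben Nevis temperatures, without the folding scheme; the variable and function names are hypothetical.

```python
import math

# Monthly mean temperatures on Ben Nevis, 1902 (January first), in degrees F.
temps = [23.8, 22.2, 25.8, 27.1, 27.6, 39.5, 37.6, 38.0, 38.4, 32.4, 30.0, 25.9]
n = len(temps)

# A[0] is the mean; A[1..5] and B[1..5] follow equations (7); A[6] follows equation (10).
A = [sum(temps) / n]
B = [0.0]                                   # B[0] is an unused placeholder
for s in range(1, 6):
    A.append(2.0 / n * sum(x * math.cos(2 * math.pi * r * s / n) for r, x in enumerate(temps)))
    B.append(2.0 / n * sum(x * math.sin(2 * math.pi * r * s / n) for r, x in enumerate(temps)))
A.append(sum((-1) ** r * x for r, x in enumerate(temps)) / n)

def series(theta):
    """Value of the complete twelve-term series at angle theta (radians)."""
    return (sum(A[s] * math.cos(s * theta) for s in range(7))
            + sum(B[s] * math.sin(s * theta) for s in range(1, 6)))

january = series(0.0)
rounded_A = [round(x, 2) for x in A]
rounded_B = [round(x, 2) for x in B[1:]]
```

Carried to full machine precision, the series returns the January value exactly; the small discrepancy in the hand computation arises only from rounding the coefficients to two decimal places.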

Example 2. The relative position of two stars is measured as follows: Two parallel wires are made to pass each through one star, and the distances $z_0, z_1, z_2, \dots, z_{11}$ between the two wires are measured when the wires make angles of 0°, 30°, 60°, 90°, ..., 330° with a fixed axis in their plane. The following 12 measures give the distances $z_0$, $z_1$, etc. in terms of one division of the micrometer head; and one division is equal to 0".4208. Find the angular distance of the two stars.

    z_0 = 50.4,    z_4 = 64.0,    z_8  = 49.6,
    z_1 = 54.3,    z_5 = 61.8,    z_9  = 46.2,
    z_2 = 59.9,    z_6 = 58.9,    z_10 = 46.0,
    z_3 = 62.7,    z_7 = 53.8,    z_11 = 47.2.

It can easily be shown that $z$ is given by

$$z_r = A_0 + A_1\cos\theta_r + B_1\sin\theta_r, \qquad\text{where } \theta_r = r \cdot 30°.$$

Then $\rho$, the angular distance between the two stars, is given by

$$\rho^2 = A_1^2 + B_1^2,$$

and we need only evaluate $A_1$ and $B_1$, following the scheme of the preceding example.

Ans. $\rho = 9.174 = 3".86$.

Example 3. The following table gives the monthly mean temperatures recorded at Greenwich from 1841 to 1890. Obtain a complete Fourier series to represent these values.

    Jan.     38.5° F.      May     53.1° F.      Sept.   57.2° F.
    Feb.     39.5          June    59.4          Oct.    50.0
    March    41.7          July    62.5          Nov.    43.2
    April    47.2          Aug.    61.6          Dec.    39.7

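86. Further simplification when $n$ is a multiple of 8.

If $n$ is a multiple of 8, then $p$ in the equations of page 182 is an even number. It is then found that the work of evaluating the coefficients may be still further shortened, in the case of the even harmonics, while the formulae for the odd harmonics remain unchanged.

Writing the $\alpha$'s in the form

$$\begin{array}{lllll}
\alpha_0 & \alpha_1 & \dots & \alpha_{\frac{p}{2}-1} & \alpha_{\frac{p}{2}}\\
\alpha_p & \alpha_{p-1} & \dots & \alpha_{\frac{p}{2}+1} &
\end{array}$$

and adding

$$\gamma_0 \quad \gamma_1 \quad \gamma_2 \quad \dots,$$

and subtracting

$$\gamma_0' \quad \gamma_1' \quad \gamma_2' \quad \dots,$$

we can then write the formulae for the even $A$'s thus,

$$\begin{aligned}
4pA_0 &= \gamma_0 + \gamma_1 + \gamma_2 + \dots,\\
2pA_2 &= \gamma_0' + \gamma_1'\cos 2q + \gamma_2'\cos 4q + \dots,\\
2pA_4 &= \gamma_0 + \gamma_1\cos 4q + \gamma_2\cos 8q + \dots,\\
2pA_6 &= \gamma_0' + \gamma_1'\cos 6q + \gamma_2'\cos 12q + \dots.
\end{aligned}$$

Again, writing the $\beta'$'s in the form

$$\begin{array}{llll}
\beta_1' & \beta_2' & \dots & \beta_{\frac{p}{2}}'\\
\beta_{p-1}' & \beta_{p-2}' & \dots &
\end{array}$$

and adding

$$\epsilon_1 \quad \epsilon_2 \quad \dots,$$

and subtracting

$$\epsilon_1' \quad \epsilon_2' \quad \dots,$$

we can then write the formulae for the even $B$'s thus,

$$\begin{aligned}
2pB_2 &= \epsilon_1\sin 2q + \epsilon_2\sin 4q + \dots,\\
2pB_4 &= \epsilon_1'\sin 4q + \epsilon_2'\sin 8q + \dots,\\
2pB_6 &= \epsilon_1\sin 6q + \epsilon_2\sin 12q + \dots.
\end{aligned}$$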

87. Practical method of investigating periodicities.

In practice, we should not endeavour to find the amplitude of a periodic term from observations extending over one period only. Thus a periodic term of one year would only be investigated by means of observations extending over a large number of years. The work is then arranged as follows.

Let $u_0, u_1, u_2$, etc. be a series of observations taken at equal intervals of time $a$; and suppose it is required to investigate a possible period $pa$. The $u$'s are written in rows of $p$ each, as follows:

$$\begin{array}{lllll}
u_0 & u_1 & u_2 & \dots & u_{p-1}\\
u_p & u_{p+1} & u_{p+2} & \dots & u_{2p-1}\\
u_{2p} & u_{2p+1} & u_{2p+2} & \dots & u_{3p-1}\\
\dots & & & & \\
u_{(n-1)p} & \dots & & & u_{np-1}\\
\hline
V_0 & V_1 & V_2 & \dots & V_{p-1}
\end{array}$$

We shall suppose that there are sufficient observations to form $n$ rows of $p$ each. The columns are added, yielding the sums $V_0, V_1, \dots, V_{p-1}$. The sums $V_0$, $V_1$, etc. are then analysed harmonically by the methods outlined above, yielding the expression

$$V = A_0 + A_1\cos\theta + B_1\sin\theta,$$

where $V$ takes the value $V_r$ when $\theta = \frac{2\pi r}{p}$,

$$A_1 = \frac{2}{p}\sum_{r=0}^{p-1} V_r\cos\frac{2\pi r}{p} = \frac{2}{p}\sum_{r=0}^{np-1} u_r\cos\frac{2\pi r}{p},$$

$$B_1 = \frac{2}{p}\sum_{r=0}^{p-1} V_r\sin\frac{2\pi r}{p} = \frac{2}{p}\sum_{r=0}^{np-1} u_r\sin\frac{2\pi r}{p},$$

$$A_0 = \frac{1}{p}\sum_{r=0}^{p-1} V_r.$$

When it is only required to investigate a simple periodicity of period $p$ intervals, it is not necessary to carry the Fourier expansion beyond the terms in $\cos\theta$ and $\sin\theta$.

The amplitude in the $V$'s of the period $pa$ is

$$(A_1^2 + B_1^2)^{\frac{1}{2}}.$$

The amplitude of the period $pa$ in the original observations $u_0$, $u_1$, etc. will be

$$\frac{1}{n}(A_1^2 + B_1^2)^{\frac{1}{2}}.$$

For if $u = a_0 + a_1\cos\theta + b_1\sin\theta + \dots$, where $u$ takes the value $u_r$ for $\theta = \frac{2\pi r}{p}$, then each of the periodic terms is repeated exactly down the columns of the table of $u$'s. The amplitude of the variation in the $V$'s is thus $n$ times the amplitude of the same period in the $u$'s. Thus we may write

$$(a_1^2 + b_1^2)^{\frac{1}{2}} = \frac{1}{n}(A_1^2 + B_1^2)^{\frac{1}{2}}.$$

The right-hand side of this equation gives the required amplitude.
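The arrangement in rows and the formation of the column sums can be sketched in Python as follows. The sketch is illustrative and not part of the original text: the function names and the synthetic series (a constant plus one periodic term of 12 intervals) are assumptions made for the example.

```python
import math

def column_sums(u, p):
    """Arrange u in rows of p and add the columns; returns (n, [V_0 .. V_{p-1}]).

    Only the first n*p observations are used, n being the number of complete rows.
    """
    n = len(u) // p
    if n == 0:
        raise ValueError("fewer than p observations")
    return n, [sum(u[k * p + r] for k in range(n)) for r in range(p)]

def amplitude_in_u(u, p):
    """Amplitude of the period of p intervals in the original observations."""
    n, V = column_sums(u, p)
    A1 = 2.0 / p * sum(v * math.cos(2 * math.pi * r / p) for r, v in enumerate(V))
    B1 = 2.0 / p * sum(v * math.sin(2 * math.pi * r / p) for r, v in enumerate(V))
    return math.hypot(A1, B1) / n          # divide by n to pass from the V's to the u's

# Ten complete periods of a synthetic series with amplitude 3 and period 12 intervals.
u = [5.0 + 3.0 * math.cos(2 * math.pi * r / 12 - 0.4) for r in range(120)]
amp = amplitude_in_u(u, 12)

# Subtracting a constant from every observation leaves the periodic term unchanged.
amp_shifted = amplitude_in_u([x - 5.0 for x in u], 12)
```

Here `amp` recovers the amplitude 3 of the synthetic term, and `amp_shifted` shows that only the constant term is affected by a uniform shift of the observations.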

The method has the advantage of destroying to a certain extent accidental errors in individual $u$'s. Periods of $p$ intervals or of any exact sub-multiple of this period will be intensified $n$-fold in the sequence $V_0, V_1, V_2$, etc.; while other periodicities will tend to be destroyed by the process of addition. For the latter periodicities will occur with different phases in the horizontal rows of the table of $u$'s, and the terms so produced will tend to annul one another on addition of the columns, provided $n$ is large, and the periods concerned be not too nearly equal to the sub-periods of $p$ intervals.

The figures given in Ex. 3, page 185, afford an example of this method. The temperature of 38°.5 given for January is the mean of the January temperature for 50 years. This corresponds to the method shown in the table on page 186, except that the $V$'s are all divided by $n$, the result being treated as the mean annual variation. In practice it will sometimes be found convenient to analyse the quantities $V_0$, $V_1$, etc. as they stand, while in other cases it will be found preferable to divide the $V$'s by $n$, treating the result as the mean variation of period $pa$.

Further, it should be noted that if the same quantity be added to, or subtracted from, all the terms which we desire to analyse harmonically, only the constant $A_0$ is affected, the periodic terms being unaffected. It is sometimes useful to remember this property, as the numbers with which we have to deal can often be reduced in magnitude by subtracting the same amount from all the terms. Thus in Ex. 3, p. 185, we should obtain the same periodic terms if we subtracted 38°.5 from all the terms before proceeding further.

Tables for facilitating the evaluation of $\sum V_r\cos r\theta$, $\sum V_r\sin r\theta$ have been drawn up by H. H. Turner, and published under the title Tables for Facilitating the Use of Harmonic Analysis (Oxford Univ. Press).

A large number of instruments and graphical methods have been devised for performing the evaluation of the coefficients of the terms of the Fourier series for any given set of observations. For a description of these instruments and methods, as well as of other instruments, tables, etc., the reader is referred to the Handbook of the Napier Tercentenary Celebration of 1914, published by Messrs G. Bell and Sons, under the title of Modern Instruments of Calculation. This work is a mine of information concerning calculating instruments of all kinds.

Example 1. Find the first four harmonics in the Fourier series for the following 24 numbers:

     97    94   100   110   103   101    85   101   115    91   106   106
    102   113   102   105    99   112    81   100   106    94    83    94

In order to decrease the numbers with which we have to deal, it is advisable to subtract 80 from the numbers given. This only affects the constant term, and can be allowed for later.

The numbers are written in the order suggested in §§ 85, 86.

    r             0    1    2    3    4    5    6    7    8    9   10   11   12
    l_r          17   14   20   30   23   21    5   21   35   11   26   26   22
    l_(24-r)          14    3   14   26   20    1   32   19   25   22   33
    a (sum)      17   28   23   44   49   41    6   53   54   36   48   59   22
    b (diff.)          0   17   16   -3    1    4  -11   16  -14    4   -7

    r             0    1    2    3    4    5    6
    a_r          17   28   23   44   49   41    6
    a_(12-r)     22   59   48   36   54   53
    α (sum)      39   87   71   80  103   94    6
    α' (diff.)   -5  -31  -25    8   -5  -12

    r             1    2    3    4    5    6
    b_r           0   17   16   -3    1    4
    b_(12-r)     -7    4  -14   16  -11
    β (sum)      -7   21    2   13  -10    4
    β' (diff.)    7   13   30  -19   12

    r             0    1    2    3          r             1    2    3
    α_r          39   87   71   80          β'_r          7   13   30
    α_(6-r)       6   94  103               β'_(6-r)     12  -19
    γ (sum)      45  181  174   80          ε (sum)      19   -6   30
    γ' (diff.)   33   -7  -32               ε' (diff.)   -5   32

$$\begin{aligned}
24A_0 &= 45 + 181 + 174 + 80 = 480,\\
12A_1 &= -5 - 31\cos 15° - 25\cos 30° + 8\cos 45° - 5\cos 60° - 12\cos 75° = -56.6,\\
12A_2 &= 33 - 7\cos 30° - 32\cos 60° = 11,\\
12A_3 &= -5 - 31\cos 45° - 25\cos 90° + 8\cos 135° - 5\cos 180° - 12\cos 225° = -19.1,\\
12A_4 &= 45 + 181\cos 60° + 174\cos 120° + 80\cos 180° = -31.5.\\[4pt]
12B_1 &= -7\sin 15° + 21\sin 30° + 2\sin 45° + 13\sin 60° - 10\sin 75° + 4\sin 90° = 15.7,\\
12B_2 &= 19\sin 30° - 6\sin 60° + 30\sin 90° = 34.3,\\
12B_3 &= -7\sin 45° + 21\sin 90° + 2\sin 135° + 13\sin 180° - 10\sin 225° + 4\sin 270° = 20.5,\\
12B_4 &= -5\sin 60° + 32\sin 120° = 23.4.
\end{aligned}$$

The above figures yield $A_0 = 20$. In order to correct for the original subtraction of 80 from all the figures analysed, we now add 80 to this value of $A_0$. The Fourier series then reads

$$\begin{aligned}
100 &- 4.7\cos\theta + 0.9\cos 2\theta - 1.6\cos 3\theta - 2.6\cos 4\theta\\
&+ 1.3\sin\theta + 2.9\sin 2\theta + 1.7\sin 3\theta + 2.0\sin 4\theta,
\end{aligned}$$

where $\theta = 0$, 15°, 30°, etc. for the first, second, and third figures respectively.

Example 2. Find the first harmonic term in the Fourier series representing the following 20 quantities:

     88   124   103    90    49   168    99    62   104    97
     51   -28    -1    42    55    59    99    38   122    37

Since only the coefficients $A_1$ and $B_1$ are required, the work may be considerably shortened.

    r             0    1    2    3    4    5    6    7    8    9   10
    l_r          88  124  103   90   49  168   99   62  104   97   51
    l_(20-r)          37  122   38   99   59   55   42   -1  -28
    a (sum)      88  161  225  128  148  227  154  104  103   69   51
    b (diff.)         87  -19   52  -50  109   44   20  105  125

    r               0    1    2    3    4
    a_r            88  161  225  128  148
    a_(10-r)       51   69  103  104  154
    α' (Subtract)  37   92  122   24   -6

    r               1    2    3    4    5
    b_r            87  -19   52  -50  109
    b_(10-r)      125  105   20   44
    β (Add)       212   86   72   -6  109

$$\begin{aligned}
10A_1 &= 37 + 92\cos 18° + 122\cos 36° + 24\cos 54° - 6\cos 72° = 235.3,\\
10B_1 &= 212\sin 18° + 86\sin 36° + 72\sin 54° - 6\sin 72° + 109\sin 90° = 277.
\end{aligned}$$

Hence the required terms are

$$23.5\cos\theta + 27.7\sin\theta.$$


CHAPTER   XII 

THE    PERIODOGRAM 

88.     Hidden  Periodicities. 

In certain classes of observations, the length of the main period becomes obvious as soon as the observations are examined; e.g. the period of the semi-diurnal tide can be deduced from a comparatively small number of observations. Once the length of the period is known, the methods of the preceding chapter can be immediately applied to deduce the amplitude and phase. But when the length of the period is unknown, and cannot be deduced in a simple manner, the difficulties of the investigation are enormously increased. Thus, if we consider records of temperature extending over a large number of years, we shall find that, once the effect of the annual period and its harmonics has been removed, the resulting records show no obvious periodicity, though they are probably due to a number of superposed periodicities, with certain accidental variations added on. In meteorological phenomena generally, the changes from day to day appear to be so arbitrary, that one is forced to the conclusion that, whatever periodicities may underlie the phenomena, they will be very largely masked by apparently accidental variations. The methods considered in the present chapter aim at unmasking such underlying periodicities, and determining their amplitudes and phases.

It has already been shown in Chapter XI that it is always possible to find a Fourier series which shall represent with any required degree of accuracy any given set of numbers; but it must not be assumed that the Fourier series is always an accurate representation of the physical laws underlying the phenomena. For it is obvious that even a number of quantities distributed at random can be represented by a Fourier series; though in such a case the use of the series might lead to the false impression that the phenomena under consideration were due to the combined action of a number of purely periodic physical causes. The real difficulty lies, not so much in finding a Fourier series to fit the observations, as in determining which of a large number of possible periods have sufficiently great amplitudes to be regarded as having real physical significance.

89.     The  Periodogram. 

The Periodogram method of searching for periodicities consists essentially in finding (by the method of § 87, or otherwise) the amplitudes of a large number of trial periods. The trial periods which yield the greatest amplitudes will yield approximations to the most probable periodicities. It will be proved later that if a trial period $T$ should fall near one of the actual periods of the observations, the resulting amplitude $R$ yielded by the method of § 87 will be considerably greater than if the trial period $T$ were considerably removed from the actual periods. Thus if $R^2$ be plotted for different values of $T$, the points of maxima of the curve will yield the most likely periods. If $N$ observations be used in deriving $R$ for the period $T$, it will be found in some ways preferable to plot $R^2N$ for different values of $T$; but in practice if we had say 600 readings to hand for a periodogram investigation we should use all or practically all of the readings for investigating all trial periods. Thus $N$ will never in practice vary considerably, and it is sufficient to plot $R^2$ for different values of $T$. The rather ill-defined maxima of this curve will yield approximations to the most likely periods. More accurate values of these periods will afterwards be derived by what is called the Secondary Analysis (§ 91). Finally it is necessary to consider how great the amplitude of any of these periods must be, in order that we may be certain that it did not arise from a purely chance distribution of the quantities considered.

It is of course impracticable to find $R^2$ for all possible values of $T$, and Schuster has shown that, if a period $T$ has been investigated, the nearest period to this which need be investigated is $T'$, given by the equation

$$n(T - T') = \pm\tfrac{1}{4}T.$$

This limit is set by the fact that any two periods which may be investigated should be independent, and two near periods will begin to be independent when there is a final difference of phase of a quarter-period. This condition leads to the above equation, where $n$ is the number of periods $T$ used in the investigation.
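A periodogram search over whole-number trial periods can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's procedure in detail: trial periods are restricted to integer numbers of observation intervals, the series is synthetic (a main term of period 10 intervals and a weaker term of period 7), and the function names `amplitude`, `periodogram` and `schuster_neighbours` are hypothetical.

```python
import math

def amplitude(u, p):
    """Amplitude R of a trial period of p intervals, by the column sums of section 87."""
    n = len(u) // p                                   # complete rows only
    V = [sum(u[k * p + r] for k in range(n)) for r in range(p)]
    A1 = 2.0 / p * sum(v * math.cos(2 * math.pi * r / p) for r, v in enumerate(V))
    B1 = 2.0 / p * sum(v * math.sin(2 * math.pi * r / p) for r, v in enumerate(V))
    return math.hypot(A1, B1) / n

def periodogram(u, trial_periods):
    """R^2 for each trial period (in units of the observation interval)."""
    return {p: amplitude(u, p) ** 2 for p in trial_periods}

def schuster_neighbours(T, n):
    """Nearest periods worth investigating after T, from n(T - T') = +/- T/4."""
    step = T / (4.0 * n)
    return T - step, T + step

# 600 synthetic readings: amplitude 2 at period 10, amplitude 0.5 at period 7.
u = [2.0 * math.cos(2 * math.pi * r / 10 - 1.0) + 0.5 * math.cos(2 * math.pi * r / 7)
     for r in range(600)]
pg = periodogram(u, range(5, 21))
best = max(pg, key=pg.get)                 # trial period with the greatest R^2
lower, upper = schuster_neighbours(12.0, 50)
```

In this synthetic case the largest ordinate falls at the trial period of 10 intervals, with a smaller maximum at 7; with real observations the maxima are less sharply defined, as the text notes.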

90. Form of the Periodogram for one simple period.

Suppose the observations to be analysed could be represented by a single periodic function $R'\cos(\kappa t - \epsilon)$, where $\kappa = \frac{2\pi}{T'}$, $T'$ being the true period of the series. Let $T$ be one of the primary periods for which the observations have been analysed, and let $g = \frac{2\pi}{T}$. Then if the number of observations be large, the summations used in the usual harmonic analysis can be replaced by integrations. The coefficients of the first two terms in the Fourier sequence then become*

$$\tfrac{1}{2}nTA_1 = R'\int_0^{nT}\cos(\kappa t - \epsilon)\cos gt\,dt,$$

$$\tfrac{1}{2}nTB_1 = R'\int_0^{nT}\cos(\kappa t - \epsilon)\sin gt\,dt,$$

where $n$ is the total number of complete periods $T$ used in the analysis. Integrating these expressions, and remembering that

$$gnT = 2\pi n,$$

we obtain

$$\tfrac{1}{2}nTA_1 = \tfrac{1}{2}R'\int_0^{nT}\left\{\cos\left[(\kappa + g)t - \epsilon\right] + \cos\left[(\kappa - g)t - \epsilon\right]\right\}dt = R'\,\frac{2\kappa}{\kappa^2 - g^2}\,\sin\tfrac{1}{2}\kappa nT\,\cos\left(\tfrac{1}{2}\kappa nT - \epsilon\right),$$

$$\tfrac{1}{2}nTB_1 = -R'\,\frac{2g}{\kappa^2 - g^2}\,\sin\tfrac{1}{2}\kappa nT\,\sin\left(\tfrac{1}{2}\kappa nT - \epsilon\right).$$

* Cf. equations (12), page 180.

Or, if $R$ be the amplitude of the period $T$ yielded by the primary analysis,

$$\frac{R}{R'} = \frac{4}{(\kappa^2 - g^2)\,nT}\,\sin\tfrac{1}{2}\kappa nT\,\left\{\kappa^2\cos^2\left(\tfrac{1}{2}\kappa nT - \epsilon\right) + g^2\sin^2\left(\tfrac{1}{2}\kappa nT - \epsilon\right)\right\}^{\frac{1}{2}}.$$

Putting

$$\gamma = \tfrac{1}{2}(\kappa - g)\,nT = \pi n\left(\frac{T}{T'} - 1\right)$$

and

$$\sin\tfrac{1}{2}\kappa nT = \sin\left\{\tfrac{1}{2}(\kappa - g)\,nT + \tfrac{1}{2}gnT\right\} = \sin(\gamma + n\pi) = (-1)^n\sin\gamma,$$

we obtain, apart from sign, the relation

$$\frac{R}{R'} = \frac{\sin\gamma}{\gamma}\cdot\frac{2\left\{\kappa^2\cos^2(\gamma - \epsilon) + g^2\sin^2(\gamma - \epsilon)\right\}^{\frac{1}{2}}}{\kappa + g}.$$

When $\kappa$ and $g$ are equal or very nearly equal, this reduces to

$$\frac{R}{R'} = \frac{\sin\gamma}{\gamma}.$$

Since

$$\gamma = \tfrac{1}{2}(\kappa - g)\,nT = \pi n\,\frac{\kappa - g}{g} = \tfrac{1}{2}\kappa nT - \pi n = \pi n\left(\frac{T}{T'} - 1\right),$$

it follows that $\frac{\sin\gamma}{\gamma}$ only has appreciable values when $g - \kappa$ is small. For the function $\frac{\sin\gamma}{\gamma}$ has its maximum at $\gamma = 0$, and decreases to zero at $\gamma = \pi$, after which it has a succession of maxima and minima whose amplitudes are small.

Considering the equation

$$\frac{R^2}{R'^2} = \frac{\sin^2\gamma}{\gamma^2}\cdot\frac{4\left\{\kappa^2\cos^2(\gamma - \epsilon) + g^2\sin^2(\gamma - \epsilon)\right\}}{(\kappa + g)^2},$$

we see that the second factor only varies slowly, so that the curve representing the values of $\frac{R^2}{R'^2}$ for different values of $\gamma$ is not appreciably altered, so far as its general form is concerned, when this factor is neglected. The general distribution of $\frac{R^2}{R'^2}$ is approximately of the form of $\frac{R^2}{R'^2} = \frac{\sin^2\gamma}{\gamma^2}$ at points near $\gamma = 0$.

The curve will show a broad band at $\gamma = 0$, and on each side of this band a succession of other bands of rapidly decreasing intensity. The result is analogous with the formation of diffraction bands (fig. 12). The greater the total number $n$ of periods used, the nearer will the diffraction bands crowd together on each side of the principal band, and the narrower will the principal band be.

It is thus seen that a curve showing $R^2$ for different values of $T$, which is of precisely the same form as the curve representing $R^2$ for different values of $\gamma$, since $\gamma = \pi n (T/T' - 1)$, shows a broad band at $\gamma = 0$ or $T = T'$, and a succession of subsidiary bands of small maximum ordinate on each side of this band. Since it was assumed in the first place that the period $T'$ was the only true period, the curve represents the effect of the true period $T'$ upon


[Fig. 12. Total width of the central band is $2T'/n$.]


the  square  of  the  amplitude  of  other  trial  periods  analysed  by 
the  periodogram ;  and  it  has  been  shown  that  this  effect  is 
negligible  except  for  such  trial  periods  T  as  will  yield  small 
values of $\gamma$. The central band does not spread beyond
$$\gamma = \pm \pi, \quad \text{or} \quad n \left(\frac{T}{T'} - 1\right) = \pm 1.$$
The limits of the central band are therefore
$$T = T' \left(1 \pm \frac{1}{n}\right),$$
where $n$ is the total number of complete periods considered in the investigation.
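The narrowing of the central band with increasing $n$ can be checked numerically. The following Python sketch is illustrative only: the function names are hypothetical, and it uses the approximate form $R^2/R'^2 = \sin^2\gamma/\gamma^2$, the slowly varying second factor being neglected as in the text.

```python
import math

def amplitude_ratio_squared(trial_period, true_period, n):
    """Approximate R^2 / R'^2 = (sin(gamma) / gamma)^2,
    with gamma = pi * n * (T / T' - 1) as defined in the text."""
    gamma = math.pi * n * (trial_period / true_period - 1.0)
    if gamma == 0.0:
        return 1.0  # limiting value of (sin x / x)^2 at x = 0
    return (math.sin(gamma) / gamma) ** 2

def central_band_limits(true_period, n):
    """Limits T = T' (1 - 1/n) and T = T' (1 + 1/n) of the central band."""
    return true_period * (1.0 - 1.0 / n), true_period * (1.0 + 1.0 / n)

if __name__ == "__main__":
    low, high = central_band_limits(20.0, 20)
    print("central band:", low, "to", high)
    for trial in (19.0, 19.5, 20.0, 20.5, 21.0):
        print(trial, amplitude_ratio_squared(trial, 20.0, 20))
```

Doubling `n` in this sketch halves the width of the band returned by `central_band_limits`, which is the behaviour described above.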

Thus we were justified in the assumption made in § 89 that the periodogram spreads a periodicity into a band. It has been shown also that the width of the band can be decreased by increasing the number of complete periods considered in the analysis. But since $\sin^2 \gamma / \gamma^2$ only changes slowly in the neighbourhood of $\gamma = 0$, the maximum shown in the periodogram will be flat, and the actual point of maximum will therefore be ill-defined. It must therefore not be expected that the primary analysis of the periodogram should yield the exact value of the period.

91.     Secondary  Analysis. 

When  the  periodogram  has  shown  the  presence  of  a  true 
periodicity  in  the  neighbourhood  of  a  primary  period  T,  the  true 
period T' can be obtained by analysing harmonically each period T,
or  the  sums  of  groups  of  m  periods.  It  will  be  supposed  that  the 
number  of  observations  is  sufficiently  large  to  permit  of  our  replacing 
summations  by  integrals. 

In the first place suppose the observations $u_0$, $u_1$, etc. can be accurately represented by a simple period
$$u = C + R' \cos(\kappa t - \epsilon).$$

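Let the $n$ complete periods be divided into groups of $m$ periods, and let the first two terms in the harmonic analysis of the $s$th group be
$$A_s \cos gt + B_s \sin gt, \quad \text{where} \quad g = \frac{2\pi}{T}.$$
Then
$$\tfrac{1}{2} mT A_s = \int_{(s-1)mT}^{smT} u \cos gt \, dt = R' \int_{(s-1)mT}^{smT} \cos(\kappa t - \epsilon) \cos gt \, dt$$
$$= \tfrac{1}{2} R' \int_{(s-1)mT}^{smT} \{\cos[(\kappa + g)t - \epsilon] + \cos[(\kappa - g)t - \epsilon]\} \, dt$$
$$= \tfrac{1}{2} R' \left(\frac{1}{\kappa + g} + \frac{1}{\kappa - g}\right) \sin(s \kappa mT - \epsilon) - \tfrac{1}{2} R' \left(\frac{1}{\kappa + g} + \frac{1}{\kappa - g}\right) \sin[(s-1) \kappa mT - \epsilon]$$
$$= \frac{2 R' \kappa}{\kappa^2 - g^2} \sin \tfrac{1}{2}\kappa mT \cos[\tfrac{1}{2}(2s - 1) \kappa mT - \epsilon].$$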
Similarly
$$\tfrac{1}{2} mT B_s = -\frac{2 R' g}{\kappa^2 - g^2} \sin \tfrac{1}{2}\kappa mT \sin[\tfrac{1}{2}(2s - 1) \kappa mT - \epsilon].$$
If $\phi_s$ be the phase of maximum in this group
$$\tan \phi_s = -\frac{g}{\kappa} \tan\{\tfrac{1}{2}(2s - 1) \kappa mT - \epsilon\}.$$

It has already been shown that if $T'$ be the point of maximum of the band in the periodogram, $T$ for different points on the band will lie between $T'(1 \pm 1/n)$. It has been supposed that the periodogram has shown that $T'$ lies near $T$, or that $T$ is well within the limits $T'(1 \pm 1/n)$. Hence it follows that since $g/\kappa = T'/T$, this fraction will differ from unity by less than $1/n$. Since $n$ will be fairly large, we may write unity instead of $g/\kappa$ in the above equation. Then
$$\tan \phi_s = -\tan\{\tfrac{1}{2}(2s - 1) \kappa mT - \epsilon\},$$
or
$$\phi_s = 2\pi ms - \tfrac{1}{2}(2s - 1) \kappa mT + \epsilon.$$

This is a linear function of $s$, and therefore $\phi_s$ should increase uniformly from group to group. If $\beta$ be the increase of phase per group, then
$$\beta = 2\pi m - \kappa mT = 2\pi m \left(1 - \frac{T}{T'}\right),$$
or
$$T' = \frac{T}{1 - \dfrac{\beta}{2\pi m}}.$$

Hence  if  the  progression  of  phase  per  group  can  be  determined 
when  the  observations  are  grouped  m  periods  of  T  together,  and 
the  sum  of  the  groups  analysed,  the  true  period  can  be  deduced 
accurately. 

In  practice  the  work  is  conducted  as  follows.  It  is  supposed 
that  the  primary  analysis  has  shown  the  existence  of  a  true 
periodicity  near  the  trial  period  T,  where  T  is  equal  to  p  of 
the  intervals  between  successive  observations.  The  available 
observations   are    arranged   in   a   table    containing   n   rows   and 
p columns, as in Chap. XI, § 87. The n rows are divided up into groups of m each, and the sums of the columns in each group written down. These sums are then analysed harmonically, either by the methods of § 87 or by the use of a harmonic analyser. Let the first two terms in the Fourier sequence for the $s$th row be
$$A_s \cos \theta + B_s \sin \theta.$$
Then the phase $\phi_s$ of this row is evaluated by means of the formula
$$\tan \phi_s = \frac{B_s}{A_s}.$$

The phase $\phi_s$ is evaluated for all the groups. The progression from group to group, $\beta$, can then be immediately evaluated. The method of procedure may perhaps be seen most easily by the consideration of a simple example.

The  primary  analysis  having  shown  the  existence  of  a  period 
of  19  to  20  months  in  certain  temperature  records,  it  was  required 
to  find  a  more  accurate  estimate  of  the  length  of  the  period.  As 
the  records  conveniently  available  extended  over  about  35  years, 
or  420  months,  the  temperatures  were  written  down  in  periods 
of  20  months,  forming  a  table  of  21  rows.  The  rows  were  then 
arranged  in  groups  of  three  periods  and  added,  yielding  seven 
groups  of  three  periods  each.  The  sums  for  each  group  were  then 
analysed as far as the first harmonic term only. The results are
given in the following table:


Group      A_s        B_s       B_s/A_s     phi_s
  1       235.3      277.5       1.18        50°
  2       367.8      -10.0      -0.027      -1.5°
  3       371.4       53.4       0.144        8°
  4       -44.6     -124.8       2.79       250°
  5      -206.1     -116.7       0.57       209°
  6       140.8     -177.8      -1.27       -52°
  7      -327.3       87.1      -0.27       165°
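The phases in the last column follow from the tabulated coefficients once the quadrant is fixed by the signs of $A_s$ and $B_s$. A minimal Python sketch of that step is given here; the names are hypothetical, and the use of `math.atan2` to settle the quadrant is a convenience of the sketch rather than anything prescribed in the text.

```python
import math

# (A_s, B_s) for the seven groups, copied from the table above.
GROUP_COEFFICIENTS = [
    (235.3, 277.5), (367.8, -10.0), (371.4, 53.4), (-44.6, -124.8),
    (-206.1, -116.7), (140.8, -177.8), (-327.3, 87.1),
]

def phase_degrees(a, b):
    """Phase with tan(phi) = B / A, in the quadrant fixed by the signs of A and B.
    The result lies between -180 and 180 degrees."""
    return math.degrees(math.atan2(b, a))

phases = [phase_degrees(a, b) for a, b in GROUP_COEFFICIENTS]

if __name__ == "__main__":
    for group, phi in enumerate(phases, start=1):
        print(group, round(phi, 1))
```

The values agree with the tabulated phases to within a degree, bearing in mind that a phase such as 250° is the same as -110°.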
The progression of phase from group to group is not obvious from this table, but may be evaluated by the following graphical method (see fig. 13). For each group plot the phase of maximum, remembering that the addition of any multiple of 360° to the phase makes no difference to the trigonometrical expression. The points in the diagram group themselves about a number of parallel straight lines, whose slope corresponds to a decrease of phase of 36° from one group to the next. The corrected period is therefore
$$\frac{20}{1 + \dfrac{36}{3 \times 360}} \text{ months} = \frac{600}{31} \text{ months} = 19.35 \text{ months}.$$

[Fig. 13.]
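The arithmetic of this correction may be sketched in Python as follows. The function name is hypothetical, and the phase progression is taken in degrees, so that $\beta/2\pi m$ becomes $\beta/(360\,m)$; a decrease of phase is entered as a negative $\beta$.

```python
def corrected_period(trial_period, beta_degrees, m):
    """True period T' = T / (1 - beta / (2 pi m)), with beta given in degrees
    per group and m the number of trial periods in each group."""
    return trial_period / (1.0 - beta_degrees / (360.0 * m))

if __name__ == "__main__":
    # Trial period of 20 months, groups of 3 periods, phase decreasing 36 degrees per group.
    print(round(corrected_period(20.0, -36.0, 3), 2))
```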


92.  Fourier Series for Random Distribution of Observations.

If the observed values be distributed at random in accordance with the normal law, the number of observations which will differ from the mean by an amount which is between $\beta$ and $\beta + d\beta$ will be $\dfrac{Nh}{\sqrt{\pi}} e^{-h^2 \beta^2} \, d\beta$, where $h$ is a constant and $N$ is the total number of observations.

Schuster* has shown that if the observations be represented by a Fourier series of the form
$$u = a_0 \{1 + \rho_1 \cos(\theta - \phi_1) + \rho_2 \cos 2(\theta - \phi_2) + \dots\},$$

* Terrestrial Magnetism, Vol. iii, p. 22.
then the probability that any coefficient $\rho$ shall be between $\rho$ and $\rho + d\rho$ is
$$Nh^2 e^{-\frac{1}{2} Nh^2 \rho^2} \rho \, d\rho.$$
Hence the probability that $\rho$ exceeds $z$ is $e^{-\frac{1}{2} Nh^2 z^2}$. The mean value of $\rho^2$ is
$$Nh^2 \int_0^\infty e^{-\frac{1}{2} Nh^2 \rho^2} \rho^3 \, d\rho, \quad \text{or} \quad \frac{2}{Nh^2}.$$
The probability that $\rho^2$ shall exceed $\kappa$ times its mean value, or $\dfrac{2\kappa}{Nh^2}$, is the same as the probability that $\rho$ shall exceed $\sqrt{2\kappa/(Nh^2)}$, and is therefore $e^{-\kappa}$. As this result will be found to be of considerable importance, a table of values of $e^{-\kappa}$ is here appended.


  κ      e^(-κ)         κ      e^(-κ)
  1      0.3679         6      2.48 × 10^-3
  2      0.1353         8      3.35 × 10^-4
  3      0.0498        10      4.54 × 10^-5
  4      0.0183        12      6.14 × 10^-6
  5      0.00674       16      1.13 × 10^-7

This table may be interpreted thus: The chance of obtaining for the square of a Fourier coefficient a value greater than three times its expectancy or mean value is 0.0498, or about 1 in 20. So that, if on analysing a series of observations we obtain a coefficient whose square is more than three times the expectancy, we can state that the probability that it is produced by a chance distribution of the quantities analysed is 1/20. If the square of the Fourier coefficient be 12 times its expectancy, the probability that it is produced by a chance distribution is 1 in 160,000.
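The entries of the table are simple to regenerate. The short Python sketch below (with a hypothetical function name) evaluates $e^{-\kappa}$ for the tabulated values of $\kappa$ and expresses each as odds of "1 in so many".

```python
import math

def chance_probability(kappa):
    """Probability that the squared coefficient exceeds kappa times its
    expectancy under a purely random distribution, namely exp(-kappa)."""
    return math.exp(-kappa)

if __name__ == "__main__":
    for kappa in (1, 2, 3, 4, 5, 6, 8, 10, 12, 16):
        p = chance_probability(kappa)
        print(f"kappa = {kappa:2d}   probability = {p:.3g}   about 1 in {1.0 / p:.0f}")
```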
93.     Application to the Periodogram.

If the amplitude of a period of $p$ intervals (i.e. of period $T = p\alpha$) be evaluated as in § 87, by writing the observations in $n$ rows of $p$ each, the number of observations used is $pn$. The mean value of $R^2$, or its expectancy if the observations are distributed at random, has been shown above to be $\dfrac{2}{nph^2}$, where $h$ is an unknown constant. The ratio of $R^2$ to its expectancy is $\dfrac{R^2 nph^2}{2}$. If we let
$$S = R^2 pn,$$
and draw a curve to represent the value of $S$ for different values of the period $T$, where $T = p\alpha$, then the ordinate $S$ will always be proportional to the ratio of $R^2$ to its expectancy. The area between the curve and the axis of $T$ is called the periodogram.

If the total number of observations used in investigating all the periods (i.e. $pn$) should be sensibly the same, then we may define the ordinate $S$ by the equation
$$S = R^2.$$
In practice $pn$ will not vary within wide limits, and whichever definition of $S$ we take, we may say that the points of maxima of $S$ yield the most likely periods.

If the period of $p$ intervals investigated above were not real, there would be no simple phase-relation between the terms in the same column of the table of u's in § 87, and the sums of the columns, $V_0$, $V_1$, $V_2$, etc., might be regarded as being independent of each other, or, in short, might be regarded as a random distribution. So that if $R^2$ be evaluated for a range of values of $T$ for which no real period exists, its value should on the whole be equal to its expectancy on the hypothesis of a random distribution of the quantities analysed. For such a range of values of $T$, the ordinate $S$ of the periodogram should show a random distribution about a mean value $\bar{S}$, independent of the value of $T$. It is generally found in practice that the periodogram shows regions where $S$ is relatively small, and the curve has no clearly-defined maxima; and this may be assumed to show the absence of any periodicities within this range. The mean value of $S$ over such a region will yield the value of $\bar{S}$, corresponding to the fact that on the whole $R^2$ is equal to its expectancy.
If $S$ be a maximum ordinate of the periodogram, corresponding to the period $T$, it follows that the ratio of $R^2$ to its expectancy is $S/\bar{S}$. Hence the probability that $R^2$ is produced by a random distribution is $e^{-S/\bar{S}}$. If we are satisfied with odds of 1000 to 1 in favour of the period $T$ being real, reference to the table of values of $e^{-\kappa}$ given above shows that $S/\bar{S}$ must be at least equal to 7. If we are only satisfied with odds of 160,000 to 1, $S/\bar{S}$ must be at least equal to 12.
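The two thresholds quoted may be recovered by inverting $e^{-S/\bar{S}} = 1/\text{odds}$. The following Python sketch, with a hypothetical function name, does no more than take the natural logarithm of the odds demanded.

```python
import math

def required_ratio(odds):
    """Value of S / S_bar at which exp(-S / S_bar) equals 1 / odds."""
    return math.log(odds)

if __name__ == "__main__":
    for odds in (20, 1000, 160000):
        print(odds, round(required_ratio(odds), 2))
```

For odds of 1000 to 1 the ratio comes out a little below 7, and for 160,000 to 1 very nearly 12, in agreement with the round figures given above.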

Thus the periodogram itself will yield a test of the reality of the periods corresponding to the maxima of the curve. When the number $pn$ of observations used in investigating periods of different length is not subject to any considerable variations, the ordinate of the periodogram may be taken equal to $R^2$. It is only when $pn$ varies considerably for different periods that it is necessary to include $pn$ as a factor, and define the ordinate of the periodogram as
$$S = R^2 pn.$$
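A minimal Python sketch of the ordinate $S = R^2 pn$ is given below. It assumes the tabular scheme referred to in § 87 (observations written in $n$ rows of $p$, columns summed, first harmonic of the column sums taken); the function names, and the normalisation of $A$ and $B$ by $2/(np)$, are assumptions of the sketch.

```python
import math

def fold_amplitude(u, p):
    """Amplitude R of a trial period of p intervals, and the number n of complete rows.
    The observations u are written in n rows of p, the columns are summed,
    and the first harmonic of the column sums is evaluated."""
    n = len(u) // p
    if n == 0:
        raise ValueError("need at least p observations")
    column_sums = [sum(u[row * p + col] for row in range(n)) for col in range(p)]
    scale = 2.0 / (n * p)
    a = scale * sum(v * math.cos(2.0 * math.pi * k / p) for k, v in enumerate(column_sums))
    b = scale * sum(v * math.sin(2.0 * math.pi * k / p) for k, v in enumerate(column_sums))
    return math.hypot(a, b), n

def periodogram_ordinate(u, p):
    """Ordinate S = R^2 * p * n for a trial period of p intervals."""
    r, n = fold_amplitude(u, p)
    return r * r * p * n

if __name__ == "__main__":
    # Artificial series: amplitude 3, period of 10 intervals, on a constant level of 5.
    series = [5.0 + 3.0 * math.cos(2.0 * math.pi * t / 10.0 - 0.7) for t in range(200)]
    for p in range(7, 14):
        print(p, round(periodogram_ordinate(series, p), 1))
```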

94.     Schuster's  Investigations. 

The  periodogram  in  its  original  form  was  developed  by  Schuster 
in  a  paper  entitled*  "The  Investigation  of  Hidden  Periodicities." 
In  this  paper  the  ordinate  of  the  periodogram  corresponding  to 
a  period  T  was  defined  as  the  amplitude  derived  for  that  period. 
With  our  previous  notation,  S  =  R. 

In a later paper†, Schuster gave a further analytical development of the periodogram, taking the ordinate to be the square of the amplitude, or $S = R^2$.

In a third paper‡, Schuster gave a detailed development of the analogy between the periodogram and the distribution of energy in a bright lined spectrum produced by a simple grating. The ordinate $S$ of the periodogram was there defined as $\tfrac{1}{2} R^2 pn\alpha$, using the notation of the present chapter.

* Terrestrial Magnetism, Vol. iii, p. 22.

† Camb. Phil. Soc. Trans., Vol. xviii, p. 107.

‡ Proc. Roy. Soc., Vol. lxxvii, p. 136.
Schuster applied the method of the third paper to the discussion of the periodicities of sunspots*. In the course of this
work  the  material  accumulated  during  150  years  was  used,  and 
the  periodogram  was  drawn  for  values  of  T  varying  from  55  days 
to  24  years.  The  whole  range  of  150  years  was  then  divided  into 
two  parts,  of  75  years  each,  and  the  periodogram  for  each  of  these 
parts  drawn  separately.  A  very  striking  result  was  obtained. 
The  two  curves  bore  not  the  slightest  resemblance  to  each  other. 
During  the  first  interval  of  75  years  the  principal  periods  were 
13f  years  and  OJ  years,  while  during  the  second  interval,  up  to 
1900,  the  lli  year  period  predominated.  A  number  of  other 
fairly  clearly-defined  periods  were  also  discussed. 

The  whole  discussion  points  to  the  conclusion  that  in  such 
observational  data  as  are  generally  treated  by  the  periodogram 
method,  we  meet  with  a  new  type  of  periodicity  which  is  ill- 
defined,  and  not  always  persistent  in  amplitude  or  phase.  Outside 
the domain of gravitational astronomy one seldom finds isolated
and  clearly-defined  periodicities.  In  the  periodogram  the  problem 
is  often  complicated  by  the  presence  of  not  merely  one  or  two 
well-marked  periods,  but  of  a  large  number  of  almost  coincident 
periods,  so  that  the  periodogram  may  then  be  compared  to  a 
spectrum  full  of  closely-crowded  lines.  In  such  a  case,  a  high 
resolving  power  is  necessary  in  order  to  separate  out  the  slightly 
differing  periods ;  i.e.  the  observations  used  in  the  discussion  must 
extend  over  as  long  a  period  as  possible. 

95.     Whittaker's variant of Schuster's method.

In a discussion of the variations of SS Cygni†, Gibb applies a method suggested by Whittaker, the essential point of which is to defer the harmonic analysis until a late stage of the work. To examine the observed quantities $u_0$, $u_1$, $u_2$, etc. for a period of $p$ intervals we arrange them as before in $n$ rows of $p$ each, and add the columns, forming the sums $V_0$, $V_1$, $V_2$, etc. Then, instead of applying Fourier's analysis to the V's, we find the difference between the greatest and least of these quantities. This is a rough measure of the amplitude of the period of $p$ intervals.

* Phil. Trans. Roy. Soc., 206 A, pp. 69-100.
† Monthly Notices, R.A.S., Vol. lxxiv, p. 678.
Let $y$ be the difference between the greatest and the least of the V's for a trial period $x$ (where $x = p\alpha$). Draw a graph with $x$ as abscissa and $y$ as ordinate. The resulting curve may be called the "curve of periods." It shows high peaks at the points at which real periodicities exist, and lower peaks at points corresponding to doubtful periodicities. Such a curve of periods indicates the range of possible periods which can be usefully explored by a detailed periodogram analysis, and forms a useful preliminary to the application of Schuster's method.
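The construction of the curve of periods may be sketched in Python as follows. The function names are hypothetical, and the artificial series in the example (a pure sine of period 7 intervals) is supplied only to show the peak; it is not taken from any of the data discussed here.

```python
import math

def column_sum_range(u, p):
    """Difference between the greatest and least of the column sums V_0, V_1, ...
    when the observations u are written in rows of p each."""
    n = len(u) // p
    sums = [sum(u[row * p + col] for row in range(n)) for col in range(p)]
    return max(sums) - min(sums)

def curve_of_periods(u, trial_periods):
    """Ordinate y for each trial period p (in units of the observation interval)."""
    return {p: column_sum_range(u, p) for p in trial_periods}

if __name__ == "__main__":
    series = [math.sin(2.0 * math.pi * t / 7.0) for t in range(210)]
    curve = curve_of_periods(series, range(3, 13))
    for p, y in curve.items():
        print(p, round(y, 2))
```

Since the column sums grow with the number of rows, trial periods with very different values of $n$ are not strictly comparable in this sketch; dividing each ordinate by $n$ would be one way of allowing for this.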

96.     Use  of  the  Complete  Fourier  Sequence. 

Turner*  has  suggested  that  the  work  of  the  periodogram 
might  be  more  economically  performed  by  the  evaluation  of  the 
complete  Fourier  sequence  of  exact  sub-multiples  of  the  whole 
range  of  time  covered  by  the  observations.  When  the  figures 
analysed  form  a  regular  series,  with  no  sudden  jumps  between  two 
consecutive  values,  it  may  be  expected  that  the  later  coefficients 
in  the  Fourier  sequence  will  be  small.  The  observations  can 
then  be  completely  represented  (with  considerable  accuracy)  by 
a  limited  number  of  terms,  and,  theoretically  at  any  rate,  the 
Fourier  sequence  should  contain  within  itself  all  the  properties 
of  the  original  observations,  including  the  periodicities  of  those 
observations. 

In  the  first  place  we  must  consider  the  effect  of  a  period  of  the 
original  observations  whose  length  falls  between  two  consecutive 
sub-multiples  of  the  total  range  of  time  covered  by  the  observations, 
and  so  falls  between  the  periods  represented  by  two  consecutive 
terms  of  the  Fourier  sequence.  It  might  be  anticipated  a  priori 
that  such  a  true  period  should  affect  most  those  terms  in  the 
Fourier  sequence  whose  periods  most  nearly  coincide  with  its  own. 
This  can  easily  be  proved  by  simple  calculation.  Let  the  true 
period  be  represented  by  a  single  term,  with  unit  amplitude 

$$\sin(pt + \delta).$$

Let the resulting Fourier sequence be
$$\sum_{q=0} A_q \cos qt + \sum_{q=1} B_q \sin qt.$$

It will be assumed that the number of observations is sufficiently large to permit of our replacing summations by integrations. This simplifies the work considerably, and does not in any way affect the conclusions arrived at. We then have
$$\pi B_q = \int_0^{2\pi} \sin(pt + \delta) \sin qt \, dt = \frac{2q}{p^2 - q^2} \sin \pi p \cos(\pi p + \delta),$$
$$\pi A_q = \int_0^{2\pi} \sin(pt + \delta) \cos qt \, dt = \frac{2p}{p^2 - q^2} \sin \pi p \sin(\pi p + \delta).$$

* Monthly Notices, R.A.S., Vol. lxxiii, pp. 549, 714.


If $p$ lies between $q$ and $q + 1$, it is clear that $p^2 - q^2$, and consequently $A$ and $B$, change sign as we go from $q$ to $q + 1$. This change of sign affords a new criterion for evaluating periodicities, but is strictly valid only when the periodicity considered is isolated. Again the values of $A$ and $B$ will be greatest for those values of $q$ which yield the least values of $p^2 - q^2$. Thus if there be an isolated periodicity between $q$ and $q + 1$, the coefficients $A_q$, $A_{q+1}$, $B_q$, $B_{q+1}$ will be greater than any of the other coefficients in their neighbourhood.

Let $p = q + x$. Then
$$\pi A_q = \frac{2p}{p + q} \cdot \frac{\sin \pi x}{x} \sin(\pi x + \delta),$$
$$\pi A_{q+1} = \frac{-2p}{p + q + 1} \cdot \frac{\sin \pi x}{1 - x} \sin(\pi x + \delta),$$
and
$$\frac{A_q}{A_{q+1}} = -\frac{1 - x}{x} = \frac{B_q}{B_{q+1}}$$
when $q$ is large. Either of the two expressions will yield the value of $x$. This value is derived on the assumption that $q$ is large, but this condition will generally hold for any period worthy of serious consideration. If $R_q$ be the amplitude of the $q$th period in the Fourier sequence,
$$R_q^2 = A_q^2 + B_q^2 \quad \text{and} \quad R_q = (-1)^q \frac{\sin \pi x}{\pi x}, \text{ approximately}.$$
This result may be compared with the equation $\dfrac{R}{R'} = \dfrac{\sin \gamma}{\gamma}$ of § 90.
The following table, extracted from Turner's Tables for Harmonic Analysis, shows the effect of a periodic term of unit amplitude, for which $p = q + x$, on the coefficients of the neighbouring terms in the series.

Divisor   q-2                q-1                q                  q+1                q+2                q+3
Factor    sin πx / π(x+2)    sin πx / π(x+1)    sin πx / πx        -sin πx / π(1-x)   -sin πx / π(2-x)   -sin πx / π(3-x)
x = 0     0.00               0.00               1.00               0.00               0.00               0.00
x = 1/6   0.07               0.14               0.95               -0.19              -0.09              -0.06
x = 1/3   0.12               0.21               0.82               -0.41              -0.16              -0.10
x = 1/2   0.13               0.21               0.63               -0.63              -0.21              -0.13
x = 2/3   0.10               0.16               0.41               -0.82              -0.21              -0.12
x = 5/6   0.06               0.09               0.19               -0.95              -0.14              -0.07
x = 1     0.00               0.00               0.00               -1.00              0.00               0.00

This table gives the value of the factor $\dfrac{\sin \pi x}{\pi x}$ in the values for $A_q$ and $B_q$ given above, the factor $\dfrac{2p}{p + q}$ or $\dfrac{2q}{p + q}$ being treated as unity.
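The tabulated factors can be regenerated from a single expression. In the Python sketch below (the function name and the offset argument `k` are hypothetical), the term whose divisor is $q + k$ receives the factor $\sin \pi x / \pi(x - k)$, which reproduces both the positive entries to the left of $q$ and the negative entries to the right.

```python
import math

def leakage_factor(x, k):
    """Factor sin(pi x) / (pi (x - k)) for the term with divisor q + k,
    where the true periodicity has p = q + x and 0 <= x <= 1."""
    if x == k:
        # Limiting value as x tends to the integer k: cos(pi k) = (-1)^k.
        return 1.0 if k % 2 == 0 else -1.0
    return math.sin(math.pi * x) / (math.pi * (x - k))

if __name__ == "__main__":
    for x in (1 / 6, 1 / 3, 1 / 2, 2 / 3, 5 / 6):
        row = [leakage_factor(x, k) for k in (-2, -1, 0, 1, 2, 3)]
        print(" ".join(f"{value:6.2f}" for value in row))
```

The printed rows agree with the table to within a unit in the second decimal place; the small differences (0.64 against 0.63, for example) suggest that some tabulated entries were cut off rather than rounded, though this is only a conjecture.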

It is clear from the table that a single periodicity can only yield large values to a few coefficients in the Fourier series. This agrees with the hump in the periodogram which occurs in the neighbourhood of a true period. Thus the Fourier series ought in theory to yield the values of the true periodicities with less work than Schuster's method demands. For the occurrence of two or three large coefficients in the Fourier sequence indicates the presence of a true period between the two terms which have the largest coefficients, and the equation
$$\frac{1 - x}{x} = -\frac{A_q}{A_{q+1}} = -\frac{B_q}{B_{q+1}}$$
should give the accurate value of $x$, replacing the secondary analysis of Schuster's method.
In practice, however, the matter is by no means so simple, as may be seen by considering the following extract from Turner's table of the Fourier sequence for Wolf's sunspot numbers (156 years).

Divisor    Period          A        B        R
  12       13.0 years      4.8      5.9      7.6
  13       12.0            3.1     14.7     15.0
  14       11.14          15.8    -19.6     25.8
  15       10.40          -4.8    -13.2     14.0
  16        9.75          13.6      0.4     18.6
  17        9.20           3.7      6.5      7.5

The largest value of $R$ is at $q = 14$, showing the existence of a true period near 11.14 years, near the supposedly well-known 11 year period. According to the theory given above there should be a change of sign of both $A$ and $B$ on one side of $q = 14$, but not on the other. But the table shows a change of sign of $A$ from $q = 14$ to $q = 15$, and of $B$ from $q = 13$ to $q = 14$. It is thus probable that more than one real periodicity falls within the range of the table given above.

When the changes of sign of $A$ and $B$ agree, it will generally be found that the values of $x$ deduced from the A's and the B's differ. In such a case it is perhaps better to use the equation
$$\frac{1 - x}{x} = -\frac{A_q \pm B_q}{A_{q+1} \pm B_{q+1}}.$$

Take, for example, the following extract from the Fourier sequence for the Greenwich temperatures:

  q        A        B        R
 22       0.4      1.0      1.0
 23      -3.0      4.3      5.2
 24       0.7     -2.8      2.9

Putting
$$\frac{1 - x}{x} = \frac{7.3}{3.5},$$
we find
$$x = \frac{3.5}{10.8} = 0.324,$$
therefore $p = 23.324$.
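The same arithmetic may be set out as a short Python sketch. The function name is hypothetical, and the use of absolute values is only one reading of the ambiguous signs in the combined equation: it supposes that both $A$ and $B$ change sign between $q$ and $q + 1$, so that the magnitudes simply add, as in the figures 7.3 and 3.5 above.

```python
def fractional_offset(a_q, b_q, a_next, b_next):
    """Solve (1 - x) / x = (|A_q| + |B_q|) / (|A_{q+1}| + |B_{q+1}|) for x.
    Assumes A and B both change sign between the divisors q and q + 1."""
    lower = abs(a_q) + abs(b_q)
    upper = abs(a_next) + abs(b_next)
    return upper / (lower + upper)

if __name__ == "__main__":
    # Greenwich temperatures, divisors 23 and 24, from the extract above.
    x = fractional_offset(-3.0, 4.3, 0.7, -2.8)
    print(round(x, 3), round(23 + x, 3))
```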

When  a  large  number  of  consecutive  terms  of  the  Fourier 
sequence  are  large,  or  the  law  of  signs  of  A  and  B  is  not  followed, 
it  is  generally  safe  to  conclude  that  a  number  of  periodicities  are 
involved.  In  such  a  case  it  is  best  to  examine  the  observations 
for  possible  discontinuities. 

97.     The  Investigation  of  Discontinuities*. 

In  order  to  investigate  the  possible  discontinuities  of  the 
11  year  sunspot  period,  Turner  arranged  the  sunspot  numbers 
in periods of 12 years. The coefficients A, B were evaluated for
years 1 to 12. Then the year 13 was substituted for the year 1,
and the coefficients A, B were re-calculated. Next the year 14
was substituted for the year 2; and the process was continued
until the whole series of sunspot numbers had been used up.
The work can be carried out in the following way. In Table I are
given the Wolf sunspot numbers arranged in periods of 12 years.
In Table II the first row is identical with the first row in Table I,
the second row is the difference of row 2 and row 1 of Table I, etc.


Table I.

Year
1749     81    83    48    48    31    12    10    10    32    48    54    63
1761     86    61    45    36    21    11    38    70   106   101    82    66
1773     35    31     7    20    92   154   126    85    68    38    23    10
etc.    etc.
Table II.

1749     81    83    48    48    31    12    10    10    32    48    54    63
1761      5   -22    -3   -12   -10    -1    28    60    74    53    28     3
1773    -51   -30   -38   -16    71   143    88    15   -38   -63   -59   -56
etc.    etc.

Table III.

Products for cosines.

1749     81    72    24     0   -16   -10   -10    -9   -16     0    27    55
1761      5   -19    -2     0     5     1   -28   -52   -37     0    14     3
1773    -51   -26   -19     0   -36  -124   -88   -13    19     0   -30   -48

* Turner, Monthly Notices, R.A.S., Vol. lxxiv, p. 82.
Table IV.

Products for sines.

Year
1749      0    42    41    48    27     6     0    -5   -28   -48   -47   -32
1761      0   -11    -3   -12    -9     0     0   -30   -65   -53   -24    -2
1773      0   -15   -33   -16    62    72     0    -8    33    63    51    28

Table III is derived by multiplying the columns of Table II by the appropriate factors cos 0°, cos 30°, cos 60°, etc. The factors are 1, 0.866, 0.5, 0, -0.5, -0.866, -1, -0.866, -0.5, 0, 0.5, 0.866. Similarly Table IV is derived by multiplying the columns of Table II by the appropriate factors sin 0°, sin 30°, sin 60°, etc. These factors are 0, 0.5, 0.866, 1, 0.866, 0.5, 0, -0.5, -0.866, -1, -0.866, -0.5.

The difference between A, B for the 12 year interval starting at 1749, and the interval starting at 1750, is due to the substitution of 86 for 81 in the first term. The difference, 5, is given in Table II, and is multiplied by the appropriate factors, 1 and 0, in Tables III and IV. To find A, B, for the 12 year interval starting at 1751, we must substitute 61 for 83 in the second term. This is the difference of -22 given in Table II, and multiplied by the appropriate factors in Tables III and IV. The process of evaluation of successive A's and B's is thus reduced to the addition of successive terms of Tables III and IV.
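The running evaluation just described can be sketched in Python. The names are hypothetical, only the three rows of Table I printed above are used, and the sketch carries the unnormalised cosine and sine sums (which differ from $A$ and $B$ only by a constant factor); each step adds one difference from Table II multiplied by the factor proper to its column.

```python
import math

# The first three rows of Table I (years 1749 to 1784).
SUNSPOTS = [
    81, 83, 48, 48, 31, 12, 10, 10, 32, 48, 54, 63,
    86, 61, 45, 36, 21, 11, 38, 70, 106, 101, 82, 66,
    35, 31, 7, 20, 92, 154, 126, 85, 68, 38, 23, 10,
]
PERIOD = 12

def direct_sums(values, start):
    """Cosine and sine sums over the 12 years beginning at index start.
    A year keeps the factor of its column, i.e. the angle 30 degrees * (index mod 12)."""
    c = sum(values[i] * math.cos(2.0 * math.pi * (i % PERIOD) / PERIOD)
            for i in range(start, start + PERIOD))
    s = sum(values[i] * math.sin(2.0 * math.pi * (i % PERIOD) / PERIOD)
            for i in range(start, start + PERIOD))
    return c, s

def sliding_sums(values):
    """Successive sums obtained by substituting year 13 for year 1, year 14 for year 2, etc."""
    c, s = direct_sums(values, 0)
    results = [(c, s)]
    for i in range(PERIOD, len(values)):
        difference = values[i] - values[i - PERIOD]      # an entry of Table II
        angle = 2.0 * math.pi * (i % PERIOD) / PERIOD
        c += difference * math.cos(angle)                # an entry of Table III
        s += difference * math.sin(angle)                # an entry of Table IV
        results.append((c, s))
    return results

if __name__ == "__main__":
    for offset, (c, s) in enumerate(sliding_sums(SUNSPOTS)):
        print(1749 + offset, round(c, 1), round(s, 1))
```

The sketch works with unrounded products, so its sums will differ slightly from those obtained by adding the rounded entries of Tables III and IV.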

Thus we obtain a series of values of $A$ and $B$. The phase $\phi$ is deduced from the formula
$$\tan \phi = \frac{B}{A}.$$

From the progression of the values of $\phi$ the accurate period can be evaluated as from fig. 13. In the work on the sunspot numbers, Turner found that the progression of phase underwent sudden changes near the years 1766, 1796, 1838, 1868, and 1895, while in the intervals between these breaks the progression of phase was regular. Thus from 1749 to 1766 the phase could be represented by a formula
$$\phi = \text{const.} - 3° \cdot t,$$
where $t$ is measured in years. This corresponds to a period of
$$\frac{12}{1 + \dfrac{12 \times 3}{360}} \text{ years} = 10.9 \text{ years}.$$
Again from about 1760 to 1796 the phase could be represented with fair accuracy by a formula
$$\phi = \text{const.} - 8.3° \cdot t,$$
corresponding to a period of
$$\frac{12}{1 + \dfrac{12 \times 8.3}{360}} \text{ years, or } 9.37 \text{ years}.$$

Turner's  results  show  clearly  that  the  predominating  periodicity 
changes  at  intervals,  bearing  out  the  results  previously  derived  by 
Schuster. 

Turner's  method  of  investigating  discontinuities  in  the  main 
periodicities  is  to  be  recommended  in  all  periodogram  work.  It 
affords  the  surest  method  of  detecting  changes  in  the  relative 
importance  of  different  periodicities  from  time  to  time.  The  labour 
involved  is  not  prohibitive,  particularly  when  it  is  considered  in 
conjunction  with  the  importance  of  its  results. 

98.  An  Application  of  the  Fourier  Sequence  in 
Periodogram  Work. 

The  following  brief  details  of  an  analysis  of  the  Greenwich 
Temperature  records  undertaken  by  the  present  writer,  but  not 
yet  completed,  may  help  to  illustrate  the  method  of  using  the 
complete  Fourier  sequence.  The  records  used  in  the  hrst  analysis 
extended  from  1841  to  1890,  and  the  monthly  mean  temperatures 
were  used.  The  effect  of  the  annual  period  and  all  its  submultiples 
was  removed  from  the  records  by  subtracting  from  each  monthly 
mean  the  mean  temperature  of  the  corresponding  month  during 
the  whole  interval  of  50  years.  These  quantities  subtracted  were 
analysed  separately  (see  Ex.  3,  page  185).  In  order  to  avoid 
negative  signs  as  far  as  possible,  10°  was  added  to  the  figures 
obtained  after  subtracting  the  general  mean  for  the  whole  period 
from the monthly means. The figures so obtained, expressed in
units of 0.1°, were taken as the material for the periodogram
analysis. There were 12 × 50 = 600 figures.
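
The preparation of the figures described above may be sketched as follows. The temperatures used here are synthetic and stand in for the Greenwich records, which are not reproduced; the function name, the argument names and the test wave are assumptions made for illustration.

```python
import math


def periodogram_figures(monthly_temps, offset_deg=10.0, unit_deg=0.1):
    """Subtract each calendar month's long-run mean, add an offset to
    avoid negative signs, and express the result in units of unit_deg."""
    n_years = len(monthly_temps) // 12
    # Mean temperature of each calendar month over the whole interval.
    month_means = [
        sum(monthly_temps[12 * y + m] for y in range(n_years)) / n_years
        for m in range(12)
    ]
    figures = [
        (temp - month_means[i % 12] + offset_deg) / unit_deg
        for i, temp in enumerate(monthly_temps)
    ]
    return figures, month_means


# Synthetic stand-in for 50 years of monthly means (not the Greenwich data):
# an annual wave with a slow 40-month ripple superposed.
temps = [
    50.0
    + 12.0 * math.cos(2 * math.pi * i / 12)
    + 1.5 * math.sin(2 * math.pi * i / 40)
    for i in range(600)
]
figures, month_means = periodogram_figures(temps)
print(len(figures))  # 600
```

Since the same twelve means are subtracted in every year, the annual period and all its submultiples disappear from the figures, and the mean of the figures for any one calendar month is simply the offset, 100 in units of 0.1°.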

Adding  together  the  figures  for  25  consecutive  months,  and 
dividing  by  25,  we  obtain  24  terms,  each  of  which  is  the  mean 
deviation  for  25  months.  These  figures  were  analysed  by  the 
usual method up to the 4th harmonic (see Ex. 1, p. 188). Next,
the figures for 10 consecutive months were added, yielding 60 terms,
and these were analysed for the 5th, 6th, ... up to the 14th harmonics.
The  first  14  periods  of  the  Fourier  sequence  were  thus  obtained, 
except  that  they  needed  a  correction  of  phase,  the  two  sets  of 
terms  being  determined  with  regard  to  different  origins  of  time. 
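
The grouping of the figures into blocks may be sketched as follows. The series is synthetic, containing two known waves, and the function names are assumptions; block means are used for both groupings, which differs from block sums only by a constant factor.

```python
import math


def block_means(values, width):
    """Means of successive non-overlapping blocks of `width` figures."""
    n_blocks = len(values) // width
    return [
        sum(values[k * width:(k + 1) * width]) / width for k in range(n_blocks)
    ]


def harmonic(terms, q):
    """Coefficients a_q, b_q and amplitude R_q of the q-th harmonic."""
    n = len(terms)
    a = 2.0 / n * sum(y * math.cos(2 * math.pi * q * k / n) for k, y in enumerate(terms))
    b = 2.0 / n * sum(y * math.sin(2 * math.pi * q * k / n) for k, y in enumerate(terms))
    return a, b, math.hypot(a, b)


# Synthetic series of 600 figures: harmonic 3 with amplitude 5 and
# harmonic 7 with amplitude 2 (illustrative values only).
series = [
    100.0
    + 5.0 * math.cos(2 * math.pi * 3 * i / 600)
    + 2.0 * math.cos(2 * math.pi * 7 * i / 600 + 0.4)
    for i in range(600)
]

means_25 = block_means(series, 25)   # 24 terms, for harmonics 1 to 4
means_10 = block_means(series, 10)   # 60 terms, for harmonics 5 to 14

r3 = harmonic(means_25, 3)[2]
r7 = harmonic(means_10, 7)[2]
print(len(means_25), len(means_10), r3, r7)
```

The amplitudes recovered are a little smaller than those put in, because averaging over a block smooths the wave; and each block mean refers to the middle of its block, which is why the two sets of terms stand in need of a correction of phase before they can be compared.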

Certain  sets  of  coefficients  were  capable  of  fairly  easy  evaluation. 
Thus  the  period  of  40  months  and  its  submultiples  were  obtained 
by  writing  down  in  a  table  the  15  periods  of  40  months,  and 
taking  the  means  of  the  columns.  The  analysis  of  these  means 
yielded  the  terms  q  =  15,  30,  45,  60,  etc.  in  the  Fourier  sequence. 

The terms q = 20, 40, 60, etc., and the terms q = 25, 75, etc.,
were also evaluated in a similar manner. The remainder of the
terms up to q = 49 were calculated by a straightforward application
of  the  formulae  of  §§  85,  86,  though  this  involved  considerable 
labour. 
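
The tabular device may be sketched as follows. The 600 figures are written in 15 rows of 40 and the column means are analysed; the j-th harmonic of the 40 means is then identical with the term q = 15j of the full sequence. The figures are random stand-ins, and the function names are assumptions.

```python
import math
import random


def harmonic(terms, q):
    """Coefficients a_q, b_q and amplitude R_q of the q-th harmonic."""
    n = len(terms)
    a = 2.0 / n * sum(y * math.cos(2 * math.pi * q * k / n) for k, y in enumerate(terms))
    b = 2.0 / n * sum(y * math.sin(2 * math.pi * q * k / n) for k, y in enumerate(terms))
    return a, b, math.hypot(a, b)


def column_means(values, period):
    """Write the figures in rows of `period` and take the mean of each column."""
    rows = len(values) // period
    return [
        sum(values[r * period + c] for r in range(rows)) / rows
        for c in range(period)
    ]


rng = random.Random(0)
figures = [100.0 + rng.uniform(-30.0, 30.0) for _ in range(600)]  # synthetic

means_40 = column_means(figures, 40)  # 15 rows of 40 months
for j in (1, 2, 3):
    direct = harmonic(figures, 15 * j)   # 600-term sum
    tabled = harmonic(means_40, j)       # 40-term sum
    print(15 * j, direct[2], tabled[2])
```

The same function with a period of 30 months gives the terms q = 20, 40, 60, and with a period of 24 months the terms q = 25, 75; in each case the sum of 600 products is replaced by a sum over one row of means.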

The  resulting  Fourier  sequence  shows  a  number  of  fairly 
clearly  marked  periodicities,  but  a  closer  examination  reveals 
evidences  of  discontinuities. 

The figures obtained when the general mean for 50 years for
any one month is subtracted from the corresponding monthly
means do not form a regular series as do the sunspot numbers.
It  cannot  be  expected  that  the  coefficients  of  the  later  harmonics 
in  the  series  will  become  vanishingly  small.  The  apparent  chance 
distribution  of  the  figures  analysed  will  tend  to  give  fairly  large 
values  of  the  later  coefficients  even  when  there  are  no  real 
periodicities in that region. In other words, the mean value of
$R^2$ will be large in regions where no real periodicity exists, and
so  it  will  be  necessary  to  consider  very  carefully  whether  the 
apparent  periods  obtained  may  not  be  due  to  a  chance  distribution 
of  the  figures  analysed. 
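
The effect of a chance distribution may be illustrated by analysing figures that contain no periodicity at all. The sketch below draws 600 independent random figures and compares the mean value of $R^2$ over the harmonics with $4\sigma^2/n$, the value usually expected for uncorrelated figures of standard deviation $\sigma$. That expectation is not stated in the passage above and is taken here as an assumption, as are the names and the value of $\sigma$.

```python
import math
import random


def amplitude_squared(terms, q):
    """R_q squared, i.e. a_q**2 + b_q**2, for the q-th harmonic."""
    n = len(terms)
    a = 2.0 / n * sum(y * math.cos(2 * math.pi * q * k / n) for k, y in enumerate(terms))
    b = 2.0 / n * sum(y * math.sin(2 * math.pi * q * k / n) for k, y in enumerate(terms))
    return a * a + b * b


rng = random.Random(1)
n = 600
sigma = 20.0  # illustrative scatter, in the units of the figures
figures = [100.0 + rng.gauss(0.0, sigma) for _ in range(n)]  # purely random

harmonics = range(1, n // 2)
mean_r2 = sum(amplitude_squared(figures, q) for q in harmonics) / len(harmonics)
expected = 4.0 * sigma ** 2 / n

print(mean_r2, expected)
```

The mean does not fall away towards the later harmonics, so an apparent period should be accepted only when its $R^2$ stands well above this general level.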